£fcf2D8 7 2 5 8* W^SSrSmU ffi*K£ig?R-t- 
5^&-C*fcS 0 #3gl£cDJ3iJCOj&Blfi, PSPUKlt 
D 8 7 2 5 8 £&&S^tt<afc 5 

10 <0-?>-?;l>'P<DP S P 1 tfcliD8 7 2 5 8 #g»ffc$r& 

?IJ#-5§- : 3 2, 33, 34, 35, 36, 37, 38, 
lit FPS-H ©OT 5/®?269-413Sr 
7 (Ohno. I. P>, V-V/<^S1£§fD8 7 2 
5 8 (1996) ) Sr-g-^TL, Lipinska£>, Nucl.Acids Re 
s. 16:10053-10066 (1988) KIE4!c£*v5l' — • a y (E. 
IE?0#^ : 2 3 (735? 9 K6 0 3-1 9 7 
yiJttEfll** : 2 8 td^-f. roEOT©=»- K{b*« 
50 |±, iS?IJ#^- : 2 8 (O^ ^ K 6 0 3-1 9 1 34 
=fiFLT&S 0 P SP 1 -4 =— Kfl5K4&tf>4 3 6©R3? 
T % S gfiayUSrSEJiJ*-^ : 2 9 td^l". 

[0015] D87258 E?iJl4E?>J#-5§- : 171;$ 
•T. r.<Dga^J<0 = — K-fkffl*tt, e?ij#-§-: 17C?^ 
U-^K4 9- 1 4 9 l&^TL-C&S. D8 7 25 8© 
a— Yikm%!i<» 4 8 0 OC^igT' 5 / ®>E?iJ£E?U#-§- : 
1 8 \Z7sk-f- a BE?iJS^- : 1 7 iC^UitD 8 7 2 5 8 S5 J"J 
141 3 2 5 (G/T) K-C^fflttM*** 
U 2 13 (g 1 y/v a 1) roftlt^fx.7$y8 
glCfcSo ^— ^/O-^SteS^-D 8 7 2 5 8 (1996) 
<Dffi?iJ»i^»C 1 3 2 5 GtIE«-r6, 1 3 2 5 TSr^-f 
■5tr8L# y * * U'*^- K#S#I4, *w*b»k:*sv*-cd 

8 7 2 5 8 (1 3 2 5T) £$fcU 2 13T^^(SrW 
■*-**f»lC=»— K**Ufc**ttD 87258 (1325 
T) #>v^ft-eS>5. V'*?' KD 8 7 2 

5 8 (1 3 2 5 T) *Jj:tWLK: = - K3*t«# 
«I4, *W»»teB«3:fr5ff*»ifcJ®lfc. te/S*J.fctf 

[0 0 16] *IJHB»-efliV*5 r«6figft»f>T-J &5ffli§ 

^sricstH-s. p s p i (Dmmmmn<om\-t, Mt'pf&m 

I4#y — t?M*»SteJ:Q«Mi-*r±C, Zcottft* 

-5 rfctjjRj /«£5ffli5l4, *«» LTV 

J: tf / 4 »4lBaSttttiM* JlttEJfc & 5 1 * C c -f , 

[0 0 17] sMUHSfffll^-S rilfh-7"j ftSfflff 

14, ajn*fcttW*fttt«c»^-a«tt^-*-S''^^^±«)" 

*-f-5St#, ib*j;V-o ($£i4-engjLh) rot («y* 
kmMZtlZo RNAJKy^7-W*-OmRNAl: 
-ft<T> = — Kfl5E9iJ|nft*-r5T 5 0>jK 
'<#JK:S£i-3IR0, =- KifcE?'Jl4E^lciIi8£LT 
v<5i&gl4#V\ 

[0018] iW&JW-cffli^a ria&;Lj sKy-t^K 

14, fitftjfcD NASSAU: J: 9g£Lfc. i"*t>t»a*L 
v^y^^KSr = - K^S^HttDNAfliaSttlcJ: 9 

(«*.f*:/9*5 K, Sfefe#, fr-f^*) 

&mm&-em^z r^tf—i 14, sij^DNA-t^y ^ 

<Dm±%lP S P 1 B?U«rMU StfifflEWPU:* * u- 

20 f\,i 0 

[0019] *wmmx-m\,^5 rg«t£jij ite^i4, 

*9\o*1- Kam*J«tt^/4fcl4B"Je?sfe«t0 5 /4fcl4# 
firT-ttt*** P S P 1 Efl'lr-t*t5. *WlB»Tffl 

i4, ®^/<epa!sa?ijco^Ticfea^, TKy-i^K 
^v^a r^n*— ^-ejijj 14, jfaaa^T-RNAJKy y 

30 7— t?ld£3-U Tffi (3* *Ir]) <0 = - K-fkE^lJcola 

tJEHtSlWt. 7'd^~ ; ?-E?iJ|4, 3-Kft©( 
*' oanWBIItt-=» Ki' (0iJx:l4*ATG) lCi0 3' ^JST^ 
•g-U */MB«>ttX£ft:l4/<9'^^?9>'Kt:IB^.«M 

•raipiijioe (5* id#«t-5. 7*n^~^- 

E?iJrtlci4, (EVMttSMK <* ^ ur — I? SlcvyiJ 
^^-cMK£ftlc£l|$nS) fciVRNAJKy > 7— fef 

40 W) zJSfoa. Jttt±*^B*— #-|4, V^ot-Cl4/«CV> 
zJS, LI4LI4", rTATAj #y*^*J*Jtf TCAYJ 
»Ky JR«±4fc7'n*-#-r4-l 0*1 

it/- 3 5 = >-fe>-t>-j*E?iJ{caaxT->W 

/uy (Shine-Dalgarno) E?'JSr'i - ^r - t"5. 
[0 0 2 0] ^ifffflV^DNA r©J^lffi^Jj f4*g 

swicrn*-#-Ey«, y #y-A*£-&gBffc, ^yr 
. ^-^{km^-, ia^<ifeitE?ij, ±.mmmm&. 

50 *ffl#TfflV>aJ:5lC, RNAsKy y 7— ^2JS7*a*-# 
-g£?iJ«£-g-U 3- KfcE^JSrmRNAKiiE^U A> 

fcajftj 14, fl-EteD n AE?iJK i <o ?£®$s& t> L < (4 f- 

[0 0 2 1] *P^iW#-CfflV^-5±p{Cl, ^7l»5^HttD 

tc«t>? r^fce&£ix-ci^.5j „ ^HttDNAtt. itesa 10 

14. ^H&DNA<4^7^5 K^^aitfy-A 

S5fH-hK#*?£*tf#5. JCtftlMSKiHLTI*. ScftL"C 

4 >"*fctt*HttDNA«r*^fi»*MS0> 

[0 0 2 2] ^af-?fflV^, rh 7 ^7i^i'3 20 
>1 *fcf4 rhyj^^is* h^tt/tj itt, MAW 
JtettroDNASr^!? >2^, ^*tt«ODNASrJftfe^iia. 
*i&tfiag«ri§:*-J-5. h7^7s^i'3y|iei|x.tf 

A'* Srffl V *-C D N A &i(Q1&lC&i-'( > 7 * * -> a >\z. 

«t9iS^-e^5, *wtt»-efflv^« tm&it®?foi f4, 9J 

f4, *-WiSifla*fcl4^r*^SHc: < t5*ii«07feffll-*3tE 30 

•r-sifaflawma-cfos. r-tr^-r^j 14, £-<<ota:tt 

[0 0 2 3] #9i$fl«-c;H^5DNA«g!$sj© rgs 

v\ »JODNA»-^rt*fcli-t©^JJ:f**UfcPIJE?r 
fgncDNA-fe^^^ h-CfcS. (fot, ASttWtt^it 

ie^77^v^L)i^DNAia d 77^^^ • 

Sit*. JlSltta- KftEyil©»J©«H:, =»-KflsE*l 40 

[0 0 2 4] *&W<D — 3<Dfig«f4, *gg$^fcPS P 

WteSHHOEJUTfca. mst^y u*?- KEWiS 
*IC**4*ft=T-C. E*l** : 2 3, 24, 2 6 L 50 
< 1*2 8{C^^y 4r^XT#SA\ */tf4^*L&i5E 
?"J#-S§-:2 3, 2 4, 2 6 t L<f±2 8 5 
a»\ *»L<ttaaEK«*4*f*:T-C, E?'J#^: 2 3, 

2 4, 2 6 tu<i42 stc^-ryy y-rx-c#5E^Jt- 
Bliltt"e*)5DNAEW*: = -Ki-**fr» ¥8£**u 
.^KEWttSIKWteSJUHL-CV^*. 

ioo25] mmzmi&te&m fiss&iaaflFdn 

SfflffiT**)!?, 0iJx.liSambrookP3, Molecular Cloning: 
A Laboratory Manual ^2KS, 1#101-104H, 3 — /U K 
• 77"y y / • / w?- • 7#7 h y - . 7*1'^ (198 

9) IdfE^ttTv^, &KlcJtttt&&tt*jilv*3£09 

W4M7*i)/'f-t- ) /3l' • S = — ^K4^Tcoi: 
*Jt)-CS>5. =■ hc-fe/uci— jx • y^jv? — Sr6 X S 
SPE, 5 Xt^- (Denhardt's solution) 
mf&l 1 fefcfl 7-< 3— /V (Ficoll) 10g, BSA 

i o gfeio^y f=/ufa y i o g) , o. o 

0 5% SDSJJiVlOOfig/ml tRNASr^f 

t5g(S i t'6 5t;T7'i'/^7'y^xt5. ^7"y 

«IK^ (0J*.tfBiosTAG- I T (g&i£«0 

fcVWT*!;^*- fa^6 st-cisi 8 
^pfflHlfi-r5. SrgfeV>-C-2 X SSCfciO! 

0 . 5 % S D S (Djgffc-e^ia'C 1 5 5>IW.t?2lHliSfef#-f 

tiss^^ y— vc-7oic-c— gfex^7w , /wAj-8SK-r 
s. 

[0 0 2 6] l^^tfeDNAE?iJf4, E?"J#-§- : 8 , 2 
5, 27fcL<li29, * Jt tta«fCJK***flsT'CE 
W#*:8„ 2 5, 2 7 fcL<«42 9{C/W^y y-CX 
■e* SEMICJ: 9 =- K$*i-5* ^stfgtm— ©7 5 
y®?Sr3— Kf -5*s, igeW3— Krol^gHtWfcAtc, 

^|3 K>UUC*J«t05UUUJi*lciT5 y&^i-A'T' 
73.>Sr=i— KL, — 7JGGX (X = U, C, ASfcli 
G) WE3OC03 h"^te±X f]) K-f-5 0 Sfc 

SIJJC, ^@WtC3g{a-r5EJlI{4, **J7 0%, »*L<(4 
*^J8 0%, &>i)jif*L<li**l9 0%PSP 1 bXfXs* 

f- K«>iai— it***-*"*. pspigMtts?^ 

i"**>%±>c * i"*^ KEMro^ft < lift 7 0 %^5fa 
?iJ#-§-:2 3, 2 4, 2 6*fctt2 8 iCjg-^-f*^, 
PSPlffiMtt5^^RS: = - Ki"5EWItt, 
ffiE^E?iJ#^- : 2 3 , 2 4, 2 6 4fcl4 2 8»C|^fC6<j 

ica«i-*. *Kttfc«tH"fajK*i'**-KE?iJtt, 

-cyy^-f-fef-v'3 ^{CtfJ, SfcliE5«it««-J:DIBI 

[0 0 2 7] *89i0>¥HUK y 51 ^ u^-f- KOAfleWt 
14, DNA, y/ADN A $5 i R N A A 5 9 , L - 
<{4t NS3tE<OtWCfc*. PSPl#V/^SSr = - 
J ASfclic DNA^^ 1/ — STT'ci— 7*-f£>. fi"Jx 
fCurrent Protocols in Molecular BiologyJ Ausube 
lib G^) . ^U-i'-^!ly/'^-7yv'X->' 

3 v • TV K • '^3 — > • M • W V^— iJ-^^v 

=11-3-^, 1989. 1992Sr#fi5<Dri: <> alftOS 
<ttl5 OifiS-^S^ > KSr*T L 

•r/<C5ia?IJS-^- : 2 3, 2 4. 2 6 t U< Ii2 8 tfclJ 

L-rv>5 a A^-S^n— 7*&, 7"n — ^ (OWi'&i^BiK 10 

fcfi^cofth.WSi%tciJtS-t--5 P S P l^V/^fC 

4 = - h'tSy/'ADNA, c DNAjfcttRNA^ V 

*aSSriaiigH?iJ0>J^tf 7r 5 y 

HKfc *tf*Wft»fcBll*i-* =»- Kfl:E*!H=«H-*-« 

5 ' '*Jj:Ut/-£fcl4 3 ■ «**»6><Oiaj«»»tt*JEflll* 

[0 0 2 8] *38WoaiJ©H8«l«:» y I' 

KWKSHHrojKy^^Kt?**. *^^ro^ffitt^y 

'<~? : f~ Y<T> :><Z>A##Jf2. EJlJ*"^" : 8 . 2 5, 27 

* fcti 2 9 ic^-rr ? y mtntt-tz pspi^^< 30 

JIKWKiWffilO* y ps 

P ifiStSfcWU #5 5 0%, »SL< fi#J7 0%, Mci, 

L < \l.m 9 0 % P S P 1 t IHMf fcT -5 V HfcfcAtt-t-' * 
£#y ^T"?- KE?iJ, EP^>P S P lStt^tSJK!/^ 

4, 26*7t{*28 iClggl L. ,EW0^7 5 / WJrtt 
< 1 1*5 5 0 %^iS?"J#-^- : 8, 2 5. 2 7$fcli2 9 



o^-^v^v^ftAtf/SifcttEWJfcfft. 40 
• ^yu— 7 f *»6*iffl^rnfi<cGAPr/u='yx^ottfflii 

[0 0 2 9] *!6BJWSlJ<Ofll^li. *«WlC#WMCP S 

«i, *^<D®{-it*i£i;i | 3^$ttfcPSPi^w< 

rW^V^^Httffi^JS-i- : 8, 2 5. 2 
7*fctt2 9K*-fT5/BiEW**rU la]-<0&«6£ 
[0 0 3 0] y 3H ^ K. i: 9 bltDNA 

mz.&mBmmzmm$nzmmT z z. t k «t 9 , asm-* 

§<0^ifelc« ^^#ig*PO*-)5teT-^A-r-#S„ Ausu 
belfe. (JbEKflffl) Mw^i. »S!tUWBLfc 
MS l_v^v^K<£>=i — KrtsEyJS:, ttS^BSfc^ 
— SfcliU'T'y =» o — ^ftST**. 

= v j/ffl <o*a&x. d n * * — *s i tWBSt<E*-e # * 

y : E.coli) . p BR 3 2 2 K — • = y : E. coli) . 
pACYC 17 7 •=!).: E.coli) . p K J 2 3 

0 (y^&m&jfflM) . pgviio6 (^7A(^t^ 

«) , pLAFRl (:/7AP£tfeSffl®) , pME290 
W-.aD : E.coli^K>^7A|§ffiWS) , pHV 

1 4 (W— • 3 y : E.colifciO 5 ^/^ : W?- 1 ) 

X : Bacillus subtilis) . pBD9 : Bacil 

lus) . p I J 6 1 h U'T' h : Streptomyce 

s) . p UC 6 h 1^7* h ?-ir^ : Streptomyces) . 
YIp5 (fyAn^t^ : Saccharomyces) . /<#s.

(baculovirus) SAiWJta^, Kny7^5 
(Drosophila) M;*.^. YCp 1 9 (*y*D?« : 
Saccharomyces) * & P S V 2 M9LI&4blBi
lb) i*LfeKIR3&*-St«)-ett*''\ — « 

fityCtt rDNA^n-=V^J : I&I I Clover^ 
a.'.IRL.7'^-*y^^7*-'K (1985) (198 
7) ; *5j;UT.Maniatis£>. TMolecular CloningJ = — 
;uK • ^7*!) ^ • ^-/<- • 7*7 h !)- (1982) & 

[003 U &m?tefflm&m. mz.n^° 



3-K+5DNAE5U*:. 

i ^ =- KibEWtt*-©^^ K*fctt y - 

^Ktt, «*.tf-f— • =»y (E.coli) tac7*o*-# 
-SfcliAite^ (spa) 7*n^~^-ti«tmf#E 

w*fflv^-c»sfc# a. y — EfJi±aiR^7*ci -t y 

*flB4431739* ; ^4 4 2 5 4 3 7#*Ji^4 3 3 8 
3 9 7-§-Sr#fi9cor 

[0 0 3 2] flJ^EJiJlcAox.. *iJBiao«*l=ff»L. 



(9) 10-117789 

x9>'*9^m&\<o&&&mmx'Zzmm&.m*M*.z> ?■ K-g-zsic-t *o . gear's y&E?ij£fcii a ftoi&e^ 

C0*SS*LV\ iaa5ga?iJlil^#{CJl*DT'$)5. jtfe^F- OTDNAE?lIKa&t-57Sy&Eyy&ffl^TS£T$ 

<n&&&ik&to3itzteitomftmm£&fcisXX'< ytt 5. z<oj:?tj:m&\te)£m&\zmtoxhz 0 *&w<n* 

[0 0 3 3] 4$£<on- K{kffi?iJS-S^/j:saSBE?iJiSrfl- fflf # -5. # y * o— f-/H6Mto*a* LvvJ&-g\ il»J L 

* LV\ a^*El6j-«WflPEWlciBf^+*» [0 0 3 6] V'<*fftt£treftbOft&J)!( 

Sfcft!l}5ll*tJb5. M*B5M8j:t«l6©|«»EM SBlcijg£-C-# -5. '^:/y K— ▼ISWSrJBv**— «W 

IE. =-KflfflMtt^T»{ClW«E«*J:t«ia*WBR«i! 20 9, Jfcl«^filSDNAt-©B y ^rolgft^f 

{fc^-^Lt^S^m-^^-KiSSftl-^n— Wbt? lB&*.1tftx--7'aT<( /U (Epstein-Barr) ^-f 

[0 0 3 4] PSP 1 *:"-^fC<Z>3§&£3*fctt?g<a ttfP5r btfX'%5. 0tJx.tf:M.Schreiere>, THybrid 

ffc&S:£-f SOiSaS LW^-g-fc&So 3S#»^S*fcli oma Techniques.! (1980) ; Hammer ling W rMonoclon 

3Sfil<^tt, ^^<^@^=i— K"f <E>E?IJ<E>— SB<OBiJ^ v al Antibodies and T-cell HybridoraasJ (1981) ; Ken 

EJlJ(D#A*3«ttJ J /*fctte?lJ(*3«Jl *fctt-tn£A±W nettfc, TMonoclonal AntibodiesJ (1980) ; *S.fctf 

?^l^tf K©t^i9, *BttfflFJIf4 3 4 1 7 6 1t;»4 3 9 9 1 2 1t;4 

EW*rlM»i-att»« #Jxtf&&*tr6]tt3iS8&&:S&3S 4 2 7 7 8 3 f ; S&4 444887^:^445257 

14, S^#lc:JS*p-C-*>5. 0iJ*.'tfT.Maniatis6, ±I2T* Of 4 6 6 9 1 7f ; g44 7 2 5 0 Of 

; TDNACloningJ I&I I ; TNucleic 30 4 9 1 6 3 2f $J,tU ! l4 4 9 3 8 9 Of Sr#iiZ)C 

Acid Hybridizatlonj ±e?3lfflfc#J80>£&. i:. Btt»ttJIJ(lC*tLT«feLfc*y ^ o— ^-/UttfleO 

3) , ^^nnBas. -imtir (cos) *us. g«o*y#B- 7-A^gtfle«r3— K-raafirT-ttSiR* 

X- /v&*?-*pjft (CHO) *asa, Koy?^? (Dr CJIIfclOPCRJHfflCJlO'WVy K-^iSHMSr? 

osophiia) *fctt*xsL-iaua<»a«fca. ****** K*n— v{b*j±tw6»-e*«. * 

-^^^-^^^^W^ ^ -r S ^, * W-<-?-ft-teig- g ^rofeifc ti:, ?1ty jr n — r^X-i * * / * a— tfrX 

«*»feB:*ftK»K-C*5. fi>'<?K#&1&Lt£W 40 t. <il»JTr*4* R I A, E L I S A^(C*JV>-C5* 

Jifflfla^W-^— h*»6iWBt-*-s*>*fcH:*eiafll»Hj»» IHi L-cfflv^5r fctf-cs 3 fcv* 5 jft-eS 

[0 0 3 5] *3g|q«?^ V^^KSrlSJ^-f 5fc»©SiJ05 "C?t 5. 

JMfeHC -f--3!) (E.coli) Sr^@$Htfe-r-5yt*(ci# [0 0 3 7] t h £M-<0®Vo<D BT*«W*s t h^ttfl|« 

P>4x7t^ B-^Srfflv^T®g^7'r^7y-S:lll^L, ictt^Sfcttjtt-S" L"C^-5=¥> («*.tfLiufe» P 

P S P 1 {C*J"f-5#y ^ ci— ^-yWjfiLftSfcli-ty ^ n— roc. Natl. Acad. Sci. USA V 84 = 3439 (1987) Sr#fi8«0r 

- = tfcJ:*. *»Wro^W<^»4*fc» ffr*ffl*y^ci— #-A/JJtflEI±Johnsfe. Nature 321 : 

{b^W-S-dJife, «*.tfi»'<7'f ! -K*ril»"e«>H«'<7 p 50 522 (1986) ; Verhoeyenfj, Science 239:1534 (198 
8) ; KabatP>, J. Immunol. 147:1709 (1991) ; Queen ^^^l* Z.tl(bW&&TZ t>COT'ttfcV\, 

<b, Proc. Natl. Acad. Sci. USA, 86:10029 (1989) ; Gorm 'j®f&&Wg\Ct± P S - 1 , PS-2, APP^tli^W 

an ig, Proc. Natl. Acad. Sci. USA, 88 = 34181 (1991) ; & ttiL<DmW%H> i &> 5. P S P 1 * fcttD 8 7 2 5 8 * V* 

imiodgsonC Bio/Technology 9:421 (1991) KfEfe *Stt, fmf&tii&'g&fcit'ttlb 0 P 

$1X5 «t 9 Id ftHkJ £ft3<0;45#*LV\, «W« SP1 (BB?iJ#-S§- : 8 , 2 5, 2 7$fcli2 9) tL< 

SiJ<Z>flg8ltt, ^m^V^f- KSfcttD 8 7 2 5 8 (iD8 7 2 5 8^^S (gfi?lJ#-§- : 18) £fcH-£ 

W^x^W- 5. 4&«K<fc-5PSP l*fcttD tte>W«fli6<J»f>i-W7'§y8lgH?iJSr{i^T^.5„ ft-fr* 

8 7 2 5 8 roatBftMffiKltt. «MB0>«|S:&MIIMF;4>b% £, P S P 1 SfcttD 8 7 2 5 803R/aSK«^-fl:«r» 

tt£tfm&0>**tt«tt<*>S. \s—9—<n 10 liD8 7 2 5 8 fSttSrfflS-r-S fcffiffl£i'i.SKlk<IMC& 

KIK<H4Mr«0*$ttftMfeJ(«tf fcS. [004 0] 4:«m«>a>J9tttRil. P S P 1 SfcttD 8 

fau- *-tt, AD^(OWSiI{kw^Troaicf&«e*i 7 2 5 8 # w^^KlcitasttKte^f 5- fcKJ: 9 . P 

*fcl±T-Wttte*ffl-C*>*. PSPlSfcl±D8 7 2 5 SP l*fcliD8 7 2 5 8<DgttSrlSa5i-5^W<0#ffi 

[0 0 3 8] #f§f3H<OBiJ<E>ftgEStt, #389!tf>:#y ^T*^ Jf^SCttLamk, Nature 354:82 (1991) SfcttBurbaum 

K(CiS'g--C#5ia?IJSr : fl'bt , /j:5r>^-feV^^!; =fp? <b . Proc. Natl. Acad. Sci. USA 92:6027=. (1995) KBfljj* 
KTfc*. -&fi!c^-y K*fcttHii 20 **i«ttflHflwJ:D, Hfls*«Pflc±-C*j«U H#3E» 

■t-STV^-feV^^WfllJBttSWHfftt, PSPlSfc {«ctt*Lfc*^=u-*-IS^Mt*«f*-rs. 9M 

{ID 8 7 2 5 8 = — Ki-5«fttf[|fe«rBtt fcPSP l*fcllD8 7 2 5 8*^1114, PSP1 

L, **U£»*Wtatt£ , U .lE^fciaJh-rsaiKliaW-C (ga?iJ##: 8, 2 5, 2 7^tll2 9) fcL<ttD8 

'*SCi:tt, 3JR#(aatt&ftfe£. -J&ftldtt, Cohe 72 5 8^V^g 0EW»* : 18) Sfctt^ft fc© 

n.J.S., Trends in Pharm. Sci. 10 = 435 (1989) fcitWe «tfilWSI*ff«?r 5 y IfeiH?lJ£Hi8;fc-C<^3. WRSrtfJ^ 

intraub. H. M. , Scientific American, 1 990^1 J! , 40H "T-St, il:&ttt£ig^i-5&3fc£fctt«fe$k£K 

*#js«>;i t. *58W©»Jott«»4, p s p i ifcttD adtttrw y h-^fctt^tr v—-?4>rm#i> 

8 7 2 5 8;? W<*Jt©lMStt»av<- 9 , j«9I*lft{*»C J: 0«ffl^riBT»fcS. HfttfcHPfleiC** 

(C^-r-SwirtClii?, PSPl4itliD8 7 2 5 8^ ^Ufc^f a U-^-iitlS, feiVf IftP S P 1 
W^H«SB*Plffl5-rS<KlJf»#ffi»-MLT«fl!lSrffffi 30 SfcttD 8 7 2 5 8 * >'^ff ©jg-g-i&tt, PS PI* 

i-5fcfeW*-j£-Cfe-5o *-? f 3.W'— .W0iJi LTtt, fcttD 8 7 2 5 8 # y^^R/^fa 

-Sri*, ;ift&KR^*t>09-ctt&v*. p s P l £fctt * 3E»*ttiB«o^riStta»SfbP s P l *fcttD 8 7 2 5 

D 8 7 2 5 8 * V/^Rtt. *fiffi»*'<- ht-*fc 8 9 *'<9%t*h&1*t*>, Hff3t»ffK:«r*Lfc«* 

tt^n^^/SJSMffcSr^? P S P l (5B?iJ#-§- : 8, StoSqSEliHL-CT y-fe-f Sr^llfii-S. &IB 

2 5, 27*fctt29) tl<liD8 7 2S8?^ fc^ V'/<^Ki«**»fi!i-#-SBIfrX»fls«r'lt«U *■ 

ft (E?q*«: 18) *fctt*ft&»«*BW*Wt«>T5 x^. U-^-fgffi&lfrtt, £3K#K:Afcia&flF. 0Jx.tf 

y®?Ba?IJ*{Sx.'CI/'>5 0 PSPl*fcl±D8 TOF-S IMSiSfe (Bruramelib, Science 264 = 399-402 

7 2 5 8itfi 34Btt/'fniatttt'fr' < - h -r-WL-kfrZWt _ (1 994) ) KlJ: P BUgT a. 

'j*'-C S -5 *ftT-e". P S >" f * fc II D 8~7 2 5 8 40" [0 0 4 1] P S P 1 * fcttD 8 7 2 S 8 ffiftFOHfliS'tt 

I**. MOP S P 1 JfcltD 8 7 2 5 8 ^ W<?R A 4 Z.kt**M&tlZ. Z<D&?\Z 

*fcttiiM«)iNBMlffitt^<- h-*--©flF«EU:BBU-Cry HfcSllfcffiR©*^* U-*-l±, FADJJitfAD 

SHEI - *. •Si'MWSili. S&K. P S P 1 *fcttD8 7 2 5 8 

[0 0 3 9] *£9i0>aijOt8!Ktt. «Matt*Ktafff 14, ttZflUB+S^w^KfcWti-SfcfcfcJB''**- 

^^^ftK/ffS1£^i!§.8-re^^K:J:9. PSPiS i*ST-#, ^offSf^ffltttttFOSM^^eQ PS 

fcttD 8 7 2 5 8 * V/<*H<&«1j6*SBSM-S4&«ro# P 1 SfcttD 8 7 2 5 8*5 .tO'-trof&WET-i: <Of£i<n9 
SICK LT±Sit6Srf?fB-r 5*-ffi"C fc -5. u— v^^S-^^-'^^effiSf^fflcoPt^gl-J: 0 , PS 
CO0BJ£ LTtt, ^T"^ K, KaEffiMfcWO'h** 50 P 1 SfcttD 8 7 2 5 8fiH±«:Hfl!i"Sfc*C>OE3RM4fe 
X.ffP S P 1 *fc»*D 8 7 2 5 8 se^g^/jg-g-x- 
P l*tliD8 7 2 5 8 k1®E.iEm-tZ*^<'<?W&:%- 

[0 04 2] SI© 2 7 V V (Yeast two-hybrid 
system) I*, ^ ^ tr4?^$G^e&{fc^<Of§te£ffl# 

ftli*®**?!^ 5 2 8 3 1 7 3^§-lCBH^<**L-t:J3»). K 

-CfcS. iffl*tc:tft^5i:, PSP1 cDNA£Ga 
1 4£fcl4Le x A^S^DNA^m«ICi»-§-L» 

^e»i*a t pi-^-f swcoiwaai t) A^ufcc dn 

A7-f/7 y — • ^ 14, G a 1 4<D h?>*T9 

7-<<— ->3> (transactivation) fI*K*fcl4SiJW h 7 

v^r^^— ->h vta^ctiiK-g-rs. PSP1 hfSS. 
ftm-z%Z>*-y<?'g.*& : $L-tZ> c DNA* B-y|:i 
9, fe^H^Stt, 0i];tf;£Ga 1 4*i.fctf y 
e^m01J^.tfGa 11-lacZC h 7 >-*.T * 5^ 

[0043] giJC0*fe(4. Igtll, I ZAP (* h 
7 ■> V) 3; PSPli c D N A 

&;LPS P 1 4r>s<*R£fctt*:ii&0*tf-f±, 

Y 9 9, 01Jx.(iFLAG, HSV^fcfiGSTK 

A>-C#5?S\ Sjtttf^syHfc-e**. fiMPSP 

i n 32 [ p ] -e y vswb-cs s a\ £ fcis^sssate l*: t 

©*fflV*T* h U7* hTtfv^fc L< l±#^|C*ti"«l(t 
«ETtffctt|-e# S. 1 g t 1 1 c DNAM7-f7'7 y- 

i4g#j<z>*M&7j>e>f£!9, ffiMPSPii-f^i^- 

hU.MU PSPltM^ft5cDNA?c- 
T.Maniatist), (±C?9lffl) £#f£<0 

-[ 0 0 4 -4 ]- -»JW*fell— -c-D N -MP"<-9-9~ 

y-i 7*^ K7f=Mk«tt«) 

y-=>:/U COSHUia*tli2 9 3ifBJ}S(c:— ^6*JI- 

L<l43 7^{kPSP l**T*-aiMS«rlSi:SU ifcj£ 
U -f^^-hfC, JMHfcUJU * 

-h7-W77-f-r^PSP lS:^ai-5^i£ 
T*fc5 (Sirasfj, Science 241:585-589 (1988) HZ.XM 
cMahanib, EMBO J. 10:2821-2832 (1991) 
t) . Z.o%mv, H Won**** a -Kf 
5 c DNA«r-&W-f--5 c DNA7*— /uSriSgiJ-C^. S #J 
01Wt"C#. h7^7i?->3 V, ft 

«fc V*— h 7 itHrtr? 7 -f — ©-y--f ^/w^rMEIt 5. 
*fc»J£» I WW c DNAIJic DN A7-f 7*7 U 
RtlLf&Wc: b 7 hU 7'l'-H;^LtP 

S P 1 <r$*-*-5JDl±t?iMa*r^=>'* r i-5i ttc: J; <3 

*«-e#a. *M»*tefl-*-rsiMs*»»u 7*7*5 

KDNA«rHl*U *B«f«P."e*t*U h7^7x^-> 
3 is iTto-y-j 9 <09 ci — v*q#e> 

10 tl Z$.Xlgi 9 t&T (Seedlb, Proc. Natl. Acad. Sci. USA 8 
4:3365 (1987) *J «fc tMruf f o & , EMBOJ.6:3313 (198 
7) £#fiSrort) . |frfr*J"<*tti«5H& 

c DNAI*®<£1<£>7'— y ^jSfcKJ; !3 A^t?# 
5 0 ±St<0— «W<e^^ y — -V'^Srliv Wongt, Sc 
ience 228:810-815 (1985) [CfflTF&tlXI/^Z,, 
[0 04 5] SfcVJO^&fi, P S P 1 

^w<^it«rii»«jiijBia*»e,JMw-a*fe-eji>5 0 p" 

20 SP 1 £GST£fcfi/jN$/£^7°^K*^<kc01S-a-*>' 

P S P l bHE.Wmi-&f-*'*9Kl±* 
ir-Xi>6ttH«K:JgHJL. SDS-PAGEICJ: 

Ft- 457 5 ygJffiyiJx-^tt:. ^^d-> 

30 TI4. j&ftH^SfclHM' h*'(>Mx.l£=-V 
^=7->&ftte'{>9-nJ*>>-3$?t>i$,Z 0 

L, «PSP lttfr-eftftitJR***. *ffittJR*«:7' 

of-/^A-t7r o— ^-ClH]ltZU, SDS-PAGE 

9*«{ku, ^M'T'hre'f-o^lrtiSDsy 

- ^ jo- 9'<omz.m V ^ 5 r 1 75S -C- # 5 o 
[0 0 4 6] ^feicsijro^rffiii, <otiib 

<r>^.-/^ K9W7*7 y — ro*^ y— -^^fe-cfeSo ffl 

**^^k*fc«:««flsPSPlli % PSPltM 

ffl-rs^y^ Ksytiiy h'7^7'7 y -a»e> 

50 m^XKim^LttP S P 1 *7tf*D 8 7 2 5 8iS^' ? — 



I1D8 7 2 5 8 *-&*<Oflq9EO»fiiB 
14, M^^S2-/^^y y K3R (yeast two-hybrid 
system; ^ ELI 5A* 

ffiv^>A>'7'yfe'l'»cJ:t)jSjS"e#S. PS P Ufc 
|iD8 7 2 S 8/*fr'<- h-*— *BEfPfflO»J«*rtt* 

[0 0 4 7] jffffilP SP ltt<t4D8 7 2 S 8*fctt 
h-^— J-BB-r«ryfe'1'tt» 0iJx.tfEL I SA 

fcttStltttiBWISP SPHKI4D8 7 2 5 8 £*fffla 

L<tt*a»»XSlcJ:9aWl**tS. PSPlifcli 
D 8 7 2 5 8 h7--«£fWfl«>»rit*tt#* 

£i±BCTstwiMttf#£-*~sB'S\ iS8ff<op s p i 

t L,<liD8 7 2 5 8 tHtttffl&<Dj&'&s<— hi-— to* 

[0 04 8] #3gl?i<D»JcDjig«fi, 441910 PSPIS" 
fcttD8 7 2 5 8*7^ * — <O^T3ai, JaiVEIjl 
LT«^B3HMt*t?*>*. * 

tK, 0.3%9<J S^SSrtUBi^&ftS. r.n6>«5»ffittftl 

0.5%WT^b. i4>m%T\ 15 

*fci±2o%»iif*-eKHEfl2-e*» aKLfcttjeost* 
[0 0 4 9] rcD<fc :>tc, fr&Wn*:?^ \s— ■*>— mm 

teft&Mmnmm&im.&vti-i.. i m i obsjbbk. a 

i V*3SW<0 9 1"* 9 S 5 0 m g i b {Cgiit 

(Ringer's solution) , *J itf*?!?!^* 
^ i 5 OmgSrtftSi 5fc«ire*5. 

XJiRemington's Pharmaceutical Sciences, ^1 Si, 
•*y9 • y*s*s9 • -4—* h> % 



[0 0 5 0] B#tt&fciffL&£&nafME£14feK«>& 

* -C-tiiQ $ -Br 5 r t «r atr. tefig^jJ^g p ftic^ £ ft 
fSfflS-S-Sf* 0. 1~ 1 0 0 Omg/Btfc 
[00 5 1] JBtfCttttlCfctT* *£gli0BJ|gtt&£ 

4-4 u^/ufc J: * — v-e3SSfc-e# -5. v 3 
t, ;WBW«>BJitt*ariMfcfi. ^IS^co^t^ u— 

[0 0 5 2] ^ite j P<Oite^<tt)7lt®r 

PSP l4fcttD8 7 2 5 sae^O^SESSrteiffl 
f*tt> i* ©Sf I- i 9 D N A l"</Ut-^tt|-et 5. ^ 
mm<n&& {tf J ADNA, mRNA$) »4, 

sa, ^j^i^jkfS, b. *fc(iiai^^CTx»i^^ 
srt^-ct, ifciiPCR, y^f— vmmsLfe (lc 

R) » MBlMHi (S DA) «f$:JHir*-C#*rlWK*B« 

J-iiiH"C#-5, fiSJxtfSaiki^. Nature324: 163-166 (19 
40 86) , BejfeT Crit. Rev. Biochei^oiec. Biol. 26^30T-33 
4 (1991) , Birkenmeyerfj, J. Virol. Meth. 35: 117-126 
(1991) . Van Brunt, J. Bio/Technology 8:291-294 (1 
990) %&m<DZ.b a RNA*fcl*c DNAfi*fc|5j£ 

WO«»»ctt*«*PCR7 r 9-f tt» PSPlSfc 
(4D8 7 2 5 8&«B*ff*mS*iJ:l«Mfri-*fcttlw 
W*.tf» M»*»J:t«*Att, E^PSP 
1 *fcliD 8 7 2 5 8y/?^?.iJt8LLT, 

SO DNASr*«W»»^ttS»t:P S P lSfcttD 8 7 2 
5 8RNAC, ^itsa^amnitM^smmitp spi 

£fcf±D8 7 2 5 8 T>"7-± N AEJlJtC^ ;/ y 

WW:; RN7- l?AfHfl2*fcH:iB»a* (Tm) <r>mz 

[00 5 31 $ fcld, £%ft£Jl& fetf»C»IBitfiHP*i 

**£freffi*fflttKJ:9|Sl:fc-e**. OritaCj, Genora 
ics 5:874-879 (1989) £r#fi3tf>r. £«, 0)Jx.«i, ^ 

-ct>«yiH-a co;!rtfc<BflM;i4. pcr 

CR*ff0fc&att&&fic?ttJtl&i:y'=f** k 

©ttJHICfcO. <*iil.-eS«. E?'J#*§-: 3 2. 3 3 s 3 
4, 35, 36, 37, 38, 3 9 *5«kt) t 4 0 (C^-fgE 

S P 1 *fcf*D 8 7 2 5 8 0#ffltt«rtttU-*-« it l-<t 

T-, PSPl«14 3 5?^l/tf Kt\ *fcttD8 7 
2 5 801 3 2 53** Is* 1 ? F"tN$£il£;ix3<0;4S$r * U 

BBJlJ#-§- : 3 2, 33, 34, 35, 36, 37, 3 
8, 3 9tS£X/4 0a»e>/£5S¥;5>lbj§tRLfc;a* \s*f- 
KEWSr^Ti-S * U**- KT-W PCRia>3t 

[0 0 5 4] D-^B*J&£Af££4<&fiMttlttt:. " 
IMt*K*f#5a\ *fctt#*>*V" k *'A''p©DNA»r>i- 

fcflre**. E2W&fc*DNA»r>m, l^^/^T 

s?3. Science 230: 1242 (1985) £ C, 

EJiJOlWfc. 4: ^SMt^-Mi 
SlScl6«iJ*.tt:^^o~fie>^fttSt*»I-C<0, DNA^ 

fori hit/uto#n'<fi—*olU£\z <fc o t*ffl-e* 
4. 0iJ;i{;SNagaminefb, Am. J. Hum. Genet. 45:337-339 (1 
989) «r#Mort. W*«*tt«-eoEMSEfl:H:, * 
tysT—wmrym* Wx.tfRN7- ^i^S 1 
(SSI, SfcfiCottonk, Proc. Natl. Acad. Sci. USA 85:43 
97-4401 (1985) |CM*3:h,afc*ttiB*ffelC J^T t 
S%i-r4:3js-e#S 0 iWiplC, #gttDNAE?iJco& 
ttWi, «*.'ttW:/y W<r!/ay .(«ittf^yo- 

£fc Whiter, Genomics 12:301-306 (19 
92) t#Iori) , RNA7-fM (t&J^tfMyers 
10 Science 230:1242 (1985) ) , it¥&)®%f (^"Jitf 

Cotton G>, Proc. Natl. Acad. Sci. USA 85:4397-4401 (198 
5) ) , ESMDNAv'-^syv'y^ SfcliSiiJKgt 

-->• a vris, a; v K* * ^T— -tfMIWMRIfeil 9 tU-fi 

Si4 (restriction fragment length polymorphisms : 
RFLP) ) l:i •} itfifc-Cfr yyADNAOffy 
7"By?^V/*flV't**4 1 0 O&gat 

[0 0 5 5] ii^ , c7)yyl'm^C*l(l*J<tt/DNA->— * = 

0iJx.tfKellerfc>, DNAProbes, .^2i£, Xl j-? h>- 
■ 7"U^, =.3.-3 — ^, = 3 — **N, (199 
3) &&m<DZk. -r^fc^, jUMSIdSSttSDNAifc 
f*RNAE?'Jl*, *«a3±tf/*fc»4Bl^©HJE{b<:-#- 

-<yy^f- is a 1/ (F I SH) %i&ki>— 
30 ^®^$^5*-&-Cfct), F I SH&HLT. #*fc# 
KOjCUktf&Zo fiSJx.«!Trachuckib, Science 250:559- 
562 (1990) , *JiU5TraskP), Trends, Genet. 7: 149-154 
(1991) 4r.#Swrt. &oT, PSPl*fcl*D8 7 
2 5 8 3tfc J F©#&Kg^<;B?g&£fflV>-C, 

[0 0 5 6] *feH:. mRNAK:*irt*jEflsicj:9lfia 

SPI £fcfiD8 7-2 5-8it^***fM4: -LXJ^t, 
40 P S P 1 *fcf±D 8 7 2 5 8mRNA<OIS^W 5 *i 

S^#03<E«ST-*>-5 <> ^Jx.(i TCurrent Protocols in M 
ol.Biol.J I&II^s, • -Of— V^^S 

Ausbeie>. (i@) (1992) «r#^CDri:. T'n-t' 

50 coi j/i/o- y«47'o-^ro|^S«:§^|i-t- 
-&ift/«e8=IiJi: U-C(±, W^y^f-j/a^ 
[0 0 5 7] / l/t i y 5 P S P 1 S t 

I4D8 7 2 5 sott^stba&wiiite^-f&ie-c-fey, 

ftfc SCriS$:Rfe Srl£?«U JWI!S<D«8tg$:IE#»;: 
U ^o-C^wJm^Sr^ft-t-So *N$#=tf-f4PS 
Pl£fc»4D8725 8 jte^^feS^flg. Sfclirt 
H&P S P 1 *itl4D 8 7 2 5 SWS-tt^rsBtPI-S^^ 

M oJtfi^-fMRH:. ^ v traif * fcfi^ ^ y trtf-e -net? 

{CfeAffi P S P 1 StftD 8 7 2 5 8 ite^fc&AT-t 
5, firlJjLtf, -^^^*n=— SiUS*^;* (MML 
V) tttt0cO3tfi?ttsmiTlc:m^CA*i0>'<*'?— -? 
fc<5o £"J;Lf4*Boris-LauerieP>, Curr. Opin. Genet. Dev. 
3:102-109 (1993) &&m<V^b„ 
[00 5 8] Ztl\Z.lt UT, ■< V fJl«Jtfip^f&*H:A# 

tt. &Siiftftktf>-ef4, JMHw&^-f-sfcftlcfl&tf y 30 

^ (Berkner, K. L. , Curr. Top. Microbiol. Immunol. 158: 
39-66 (1992) (C|E«) t> L < tt7f J HiSfr'f/l'* 

(AAV) $ — (Muzyczka, N. » Curr. Top. Microbio 
1. Immunol. 158:97-129 (1992) *J J:t«fcH#lf$6 5 2 5 
2 4 7 9-S§-{CfE*£) K ry<y4r-^V^j SU©*- 

r^DNA (nakedDNA) J ©S4t?fc 

K> > h li'j: t>\~ KftlURI^ttiRffliBe-?-* 40 

[0059] *^^roiteT-f&SEic«-ffl/«e*flsasi-i±, 
y fflFAitti. ttfffBtt. flumua, Bsroeicw 

©ALTfT? r i/45-C#-5. ^Jxli^H^rFMS 2 4 0 50 
8 4 6-§-tr#fiaor i:. 

[0 0 6 0] 43gHaffJ0>tttigtt. ffigtf>|fflj!&lc#3§93 
©sXy^^ l/tf K4fcliD8 7 2 5 8&&&+2>ifeti 

f4fi*OK«:*»W©# y 3? * K"C\ D 8 7 2 5 

8 T- * it 14 1 & |C JLU itl5 h7V^7 

736866t;g5175385f;H517538 
4ffeJ:o:i5 1 7 5 3 8 6 -f-Sr^f&W r t. '&<btlft 
h? ^Xi? V fSh&lliP S P 1 £fc!4D 8 7 2 5 8 

*5. £ Ototttffflfch 5 y*»»M\ PS 

Pl$fcttD8 7 2 5 8^^<^KO«9UcHXi-5tt 

[0 0 6 1] ~ 

mmmi- ps-i^<-ht-pspioiig 

The PS-1 cDNA (GenBank accession no. L42110) (SEQ ID NO: 9) encoding the region of amino acids 269-413 of PS-1 (SEQ ID NO: 10) was amplified by PCR with the oligonucleotide primers 5'-CGGAATTCCGTATGCTGGTTGAAACA-3' (SEQ ID NO: 11) and 5'-CGGGATCCTCAGGCTACGAAACAGGCTAT-3' (SEQ ID NO: 12). The product was digested with EcoRI and BamHI and cloned into pEG202 (Golemis et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York (1994)).
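As a quick consistency check on this cloning step, the following minimal Python sketch looks for the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences in the two primers. The primer sequences are copied from the text; the helper name `find_sites` and the dictionary layout are illustrative assumptions and are not part of the described method.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "SEQ ID NO: 11": "CGGAATTCCGTATGCTGGTTGAAACA",
    "SEQ ID NO: 12": "CGGGATCCTCAGGCTACGAAACAGGCTAT",
}


def find_sites(seq):
    """Return {enzyme: 0-based offset} for each recognition site present in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        # Report primer length and which enzyme site it carries near its 5' end.
        print(name, len(seq), find_sites(seq))
```

Under these assumptions, each primer carries exactly one of the two sites close to its 5' end, which is consistent with the amplified fragment being cut with EcoRI and BamHI before ligation into the vector.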
?#P>iXfc7!7^ 5 K, pCC3 5 2ll DNAg^V 
L e x ASrPS- 1 ©7 5 yB£2 6 9-4 1 3 

^<^^— , p EG 2 0 2f±T/U = — /WJBl*iR»SR (AD 
HI) T'a^-— ^ — Srffl^t, LexAfi^V/^S 
fcJ;Vjg)3iJ-v— * — fc LTH I S 3 
*<i>9 — aSdDNA->— ^ai^f— (T^'f 

D, Le xA^7U-Artffi-g-L7tr.tSr5t^.Lfc. 
[0 0 6 2] 2^7*liyK^#y-^l:ffl^t4 
X<Ojj&> 7*7^ 5 K*JiVt*ttGolemise>, (±!5IC 
§lffl) JCl¥iSa(C|2«c$ix-CV>-5. »»«cEGY4 8 (M 
ATa, trpl, h i s 3, ura3, 6 o p s — L 
EU2) 14, 7'7^U'pCC3 5 2fc , J:0 ! pSH18 
-34 -ePIB*»KjE».Lfc. »K«5»ff tt» -7 7 5/yw*J 
J: ^fj?y** < 36±*'^«f»«r)Bv^T»*J Ufc„ 
7*7^ 5 Kp SH 1 8 - 3 414. 8l©LexA^f 
-^-SB&riS, L a c Z&m+joXXfimm-T—X-b L [ 0 0 6 4 ] %H&m 2 - PSP1 cDNA?n-= 

TURA3fflMS:^t5f'>GAL IT'nt-?- ^^.tt^gS^W 

- P S - 1 &&#<D±&to'&liZl±. L e xA{d*fLTJt xyi/i/^lc^Lfc, fflSfPfflSi'n^ y^^gc DNA 
[Sj-f-isKi; ^n— • ^^fetjkfSSrfflV^T^ettta^*^! W-oODS?iJ^*f^«t 9 , E3*tff-i§-: l©20fl:lTG 
^f^D7h^fflw.t>3J6KLfc. LexA-PS- . GAT'^i!), 5 2 30{!l:lT'TGAt#h>5 1 7 37 
lfi^*t-C'liLEU2fcLa cZVtf-fi-ftZt 5 ./ Ifc dB^US-fl- : 2 ) ^ >-^^®Sr = - K-f<5 5 1 9 
fStt{k-Ct^*»ofc 0 $lb^, Le xA-PS-ljgfc-g- U^K3h-7*vy— 1 — A(I£J?IJ# 
*<D^WSAtg^I*5J:U { DNAM-g > ^tt^7*L's/-> 1) j)S^$ixfc 0 c DNAEJiJifcH-T- • = V 

a >'Ty^r-1'^ffl^^T56mLfco 10 (E. coli) i? y >7v=rT— t?h t r AO— SSfdfSl^te 

[0 0 6 3] LexA-PS-Ha^iVpSHl Sr^-f^T 5 7 ^ffi?USr^-f 3 B L A S T X*S J: 

8-3 4 (CCY3 2 1) £r-g-*rr£tfcli, ^'fT'y!) LASTN7^yXAJffi^fci;-y/ i !y : 5' • f-f 

- • *-Jr— /Hf2@SG&7'n h=t— vu£fflV-C\ 7"7;*5 l±, ifET^Iffl-rSLipinskalb, |CfE® * ftX^Zt (12 
KpJG4-5t"thMcDNA7-f/7y- ?IJ#^ : 1 3*J±V1 4) „ r WSrJS c D N Ali P S P 
nvfy?) -CMjre&Lfco r07-f 7*7 y — 7*7* li»t5„ c DNA53J: t) ^^^SC^^tSfc*^, 
5 Kit TRP liSSiJ^-tf-Sr^TU SV40^S : 1 «0 8 3 - 1 0 6 iggatfcgj-f;*- ]) =r* * 
&'fc;IH3"lJ, Sf?7*0 5/7* (acid blob) B42, SoJ:tJ*^ ^KS" - C T G G ATG GGG AG GT G AT T G 
^^^/U^=^3it* h— 7'^^'^^r-f5*S/-fes' h GGAGTG— 3 ' (g2?U#-S§- : 15) Srffl^T, 5*— 
l-s !&£•#"( A Dgi-S-fc:) i: LT c DNA£r3§5i£-tt- V h 7 c DN A^iSJS'J^x A (3*7*= BR 
5. Gyuris<b, Cell 75:791-803 (1993) &&m<OZ 20 L) SMgffl L"C. ^-/^^ y7"F (Superscript) 
i:. -OiK-g-^coS^Ji, #7* xmmyv*r—fi t HffiSc DNA^W 7*7 y — (=¥7*=BRL) y 
-GAL 1 ro$fJ^Tt-fc5. 7 is ——'yWtz 0 Innisfc, P C RProtocols : A Guide to 
/U. t^9 L ^^fc-<tU'hy7 , h7r>'Sr^<^^:S / >« Methods and Applications, 7*7*5 jx^ • 7*U^, 
HfiJbKl7*L — Hfc, *xJ4.5xlO°<O<B>*CO?f0S^m^* tyf^xrf, * y 7d-/U=7iW • (1990) iJia^anibro 
^fetu, 7*— yUL» S[ig$-«:fco #*<£>£^= n=— £• ok<b, MolecularCloning : A Laboratory Manual, ^2 
SS'J C PICSS7 , L — bZHtzZ. &<rfMB^S)t<!>lc. 2 Jig. 3-;i/K-^7"y^-A-/<-.^^ 

x lO" 7 ^;® J® (<B*0^fCG5&#<o;i$3fflF<0$c) £^7 K'^7!I^t''/w<-, - a _ 3 _^fl| (1989) Ic 

•>'K fc**-5?:^ hy7 , h7r>'*JJ;0*n^-»'^ |5«J$ix-5 <fc ■k.l&mi&PCRt.tiltmmitotl:'^ 

ADJK-g-ro^Sr^t-S/tfecO^^aSi: LT*7 7*y ^ if- -> a ^3kVr*m^X^ y — =. 

* h-^/77^y-^4r^tpg'>^fia(C7'U'-J>L 30 >^Lfc. P S P 1 Sr-^^f-rs^tl.f)CO^m±MRS^ 

&<7 7v'/l', fc*7^ ^*3±U5 h y 7 , h77 ^Sr^<^ — >ffc5EyiJ#-§- : 3*5«tO*gE?lJ#-f|- : 5 

^t53 D =:-|i»:IC#7^ h .tC^L a [0 0 6 5] E*iJ#^§- : 3 OEJ^ffK «fc *) , E?lJ# 

cZM^Lfc. #7^h-^feff?)»fLEU2 f : 3©l©{£gT-CCCT?JSS t). 9 7 2WftfCT 

fci^La cZV7ii-*-<DffiJf$:fgmk-r5Ztlt><0 GAt*iD5 3 2 37;y8 (E7<l*-g- : 4) ^W*^ 

¥8S#li, BH4i:%*.&*i. $ kf£S*S*£3&feLfc. 7* S*a- h'-r* 9 6 ^**f- K*— TVy-rw s 

7*5 K£g#«:ri>lb*||ftU -f— • =y (E.coli) KC ^7U-A^$nfc„ ga?iJ#^-: 5©ejflJ#*ffcJ: 

8&&MW$x&-fZ<n{Z.mi<\ h y 7*b7r VSr:fc<& 9, ffiJU#-S§- : 5W 1 (Ofi@fCTTt?te4 9, 127 
(E.coli) itflLkT-J&SS-yr*:: <kld «fc !) 40 2 <n>$mx-TG AT-^5 4 2 3 T 5 7 ®S (g2?U#-*§- : 

ADjK-g-7'7^5 K^riSS'JLfco *f#£E&<JfS2f1U88!-S-ft- 6) ^ W-?^K£r=i — K"t"5 1 5 0 0 % 9 \^^f~ Kal— 

5r^1--5#^WADSi-a-7'7^ 5 Kfi. CCY3 2 1 7*V y -7 W ^^7 U'— JxtfTjk&hti. 

Sr^K^-TSfcft^ffl Lfco ao*><73?KK$K^fls:|j: ^=7^ K»i, B?>J#» : 505-2 8*SS*ftrS 

#7^ h-^ft#ttLEU 2feiVL a c ZgM^ t, ty^^l/tfK5' — GTCTCTGGGCC 

* y-^^^Jw^L-fc. nWmtf&m&)X'*)Z>Z. CCGGTTGTCTGTTG — 3' (Sa?lJS^ : 1 
«S§&-t-5fcfefCl, =S-AD)K-g-7'7^ 5 K«2 2 W^gg 6) SrfflV-CHiS Lfc ; 7-< 7*7 V -fc±^ ^ y — = 

ttL e x Am&fi^'<?Mtm3:{tmi-z>mfi&&.%iL y^/Dh^-zn^iDiv^ ^^y-^x^w^- 

fc. ^^.y-:=V^«>JB^:9.fi-VK«raiaU LexA 7 9 ^ KT-fi, UfcgaJlJS^- : 7ttSg<73 

- P S - 1 a^i*fSWlC«S^f5 ADS^7*7 cDNA^o-^^ft<5, E5iJ#^ : 7 OffiJiJ^ 
^$K«-^Lfc. 50 ICJ;?), gS3?iJ#-^ : 7C32 5 KOffiB-CATGT'^* 
9, 1 6 2 7 ^ffit-CT G A 545875^8 
(i2#J#-g-:8) * W^fCSr^- K-f£ 1 3 7 

— (Marathon Ready) J t h/HcDNA (^n^y 
^) *3<tU : 7 P ^^"^— (O^* h • iry b (nested set) 
SrJHwe, 5' ^U'^KEWfcS' RACEtM 

ga?u##: imwk^y^ ^-5' -CCAA 

CAGACAACCGGGCCCAGAGACT— 3 ' 

(EOT**: 2 0) *5<tl>*5 f 7^-y7^*7 1 

(^n^xy^) Sr. «W^PCR#«fcttfflU E3*J 
74*SM^9-r-7-5* -TGCCTCCTCG 
CCCGCCCTACTCAGA-3' (EM**: 2 
1) 7^-^7^v 2 (^pyf^ 

ii P cR2. i wyf hp-yy) kt/a^p-v 

Iklt 5Fjc36^' (staggered) 5' *»ttt8 1 8 

*^EW«r*lSfc (EM«*:'2 2) 0 av-fcv-p-^E 
M (EJIItt : 2 3) Sr^lSSfcftK, EMS*: 2 2 
S3J:t«e5««» : -7«:»W{b+* ^^vtf K02 
2 scofitgic^ Artf£ih=i K^fctK 
^V'J&SEWS-S- : 7^«**t5tt»C*tJES"*-5ri: 

[0 0 6 61 Jftfi-T-Oa VfeV^^E^JO^ftttPS P 
1-1 (EW*t: 2 4*5*0*2 5) , PS P 1-3 

(EM#»: 2 6*5j:t52 7) *3itfPSPl-4 (E 
9IJ#*: 28*3^^29) i$fcU 5' aytyf^E 
M (E?»J#^- : 22) , *©»»»OSP 1 ^o — >\ 

ft&Vlc:#*EyiJ#-©" : 7, 3*5±a c 5«r#5SEyWfc»c 
-tO* tfc. PSP1-1 (E #J#* : 2 5 ) (DffimT 

• n!i (E.coli) h t r A (E^J#-§- : 
14) ^^SE^Httf:. BESTFITT/V^yXA 
^^^^v/V^c^v^i^-r^ y ^ ^ • 3^^-?- * 

OMHtt*5i:t«3.5%OiaHtdMM£ft. Bl (± 
«S, P S P 1 - 1 ; T». -Y — • a y (E.coli) h t 
rA) ld«f Q ±T«)-fey 

Six*. Bi|Lt^f5;>'*3j;t;ty> j £-f-7Gxsx 

GI4, P S P 1 - 1 <D&* 1 9 8^<tt>*3 0 4 - 3 0 8 

[0067] *cW7'?V '> 3 ^ (gap creatio 
n) fcitfx^^f ^v / 3V"<t^r^- (extension 

penalties) *S*/r5. 0*5i W. 3T?fc5, PILEUP 
fcit/PRETT YT/l-^V Xis (£ -< * 3 
v^^v^ • • SrJf)^ 

tPSPl-2, PSPl-l, PSPl-3^<t(;P 
SPl-4^^ U;*-^ KEMtfctfcttBI 2-10 Id* 

f o aEWfbote*. g?ij<£>* * k i s 4 i <o#e 

-C\ P S P 1 -2*3J:tfPS P 1 - 1142 2 5ttfi*t*r 
4«fcU PSP l-4f41 9 5&&tt*!Kik~tZ>Zbtf 
**ixfc 0 ISI— <E>gJiJrt<0 i 9 4 2<dx? KffiB 
"C\ PSP1-4 f4.P SP1-2, PSP1-1 tSXXf 
P S P 1 - 3 fcfiqSE-j-* 9 6ttS#'4r'£:^-c^5 0 
ffe»ffi0>af£*wc»4* ^^9^^*Mfe=v-fevi»-^EW 

AGG4fc(4TGG U^T^I") rtL<b<7}£ 
10 (alternate forms) ttftSx-W** v'V 

/tB5 t^^fcS w i ftSift ItV^S, Mount, S. , Nu 
cl. Acids. Res. 10:458-472 (1982) $r#J8<Owi: Q 15 
4 1 0{t«-COa*Wt<7)^7'7W i/^^t-ct 0 v PSP 
1-3 KfiqSEi-SfMia K>" (12-10 X'T&Ztf 
-*-) #0**3 tt*. *6Jw % PSPl-2fc±VPSP 
1-1 f4gjij<£> 6 7 2 cOffigT-*— <Dji $ K^ffi 
ft*-*. PS PI- 2(4. v-^tM V£r^ — Kt5TG 
C=r KV£f£ST£^ffigT-^U -*PSP1- 
Ittf*-^ = — KtSCGCa K^Srf^5CSrw 
20 Offltt-e-MT-J-a. PS 1-1 (EM**: 2 4) <OJt 
ia-e5lfflUfcOhnofeO*S3et hir y ^077- fef (E 
?|J#4§- : 1 7) u;*-?- KEWtttM-* 9 , 

■GAP7;Vrf!/XA*fflV^i4 9%<D|a-tt^ BE 
STF I TTA-=ry X^Sr/g^Si: 6 5%£)|^— ttris*. 
£ixfc *Mti5*LTV*fcV*) 0 PSP 1-1 (EW 

#»: 2 5) ©B*7?y|tEJlJ«), ±f2T*§lfflLfc0h 
nobOD 8 7 2 5 8 7nf7-* (EMtt : 1 8) 
»i-*«?»Mb«4, BESTFIT7^yXASrfflV^ 
SUfiSc I, @1 1 (±«B, P S P 1 - 1 ; TfflS, Ohnot>^ 
30 D8 7 2 5 87 P Pf7-f) fd^-fo 

*5 4 6%iora-tti*«*Sixfc. 

[0 0 6 8] H^^J 3 - P S P 1 Offlltt^* 

5f>4>*f«r*i*L-C, fc HUHrtOPSP lmRN 
AO»#«:JBJgLfc. PSPlEJilC#Lt»fit8 3 
0tKSd-y 1^*9" K^o-r^SrttfflUfc (5*- 

ATGCTGAACAT CGGGAAAGC TTGGT 
TCTCG-3') (EW#*: 1 9) . 
«(^nyyfy^# 7 7 5 0-l, #7 7 6 0-1, 
ftitf# 7 7 5 5-l) ^b^mRNASr^tt^^- 

jg, mm. jbk. «ii£T^ pj 

«a*jJ:tJtK»w±TO««-e«lilSixfc. PSPim 

[0 0 6 9] H1KW4 - P S P 1 ^Stt^>^ttl 
PSPlty^^.ytf K1AFC, lAFTfeitf 
lARtt, Arg?:Cy s C7 ^ J *SE{bS*S ^ ^ ^ 
50 K6 7 2 V) "C^^SfttSr^ttl 
(ASO) 1 AFC^t^l AFTI*. rftfefl) 

1AFC: CAT CCG GCA TTG TTA GCT CTG C, 22-mer (SEQ ID NO: 32)
1AFT: CAT CCG GCA TTG TTA GCT CTG T, 22-mer (SEQ ID NO: 33)
1AR: CAA TAG CTG CAT CAG TTT GAA TG, 23-mer (SEQ ID NO: 34)
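The design of this allele-specific primer set can be checked with a short Python sketch. It assumes the two forward primers are the 22-mers ending in C and T, takes the reverse primer 1AR as the 23-mer CAATAGCTGCATCAGTTTGAATG, and uses only the first 60 bases of SEQ ID NO: 1 as given in the sequence listing. The function names are illustrative and not taken from the source.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def differing_positions(a, b):
    """Return the 0-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Allele-specific forward primers and the shared reverse primer (5' to 3').
PRIMER_1AFC = "CATCCGGCATTGTTAGCTCTGC"
PRIMER_1AFT = "CATCCGGCATTGTTAGCTCTGT"
PRIMER_1AR = "CAATAGCTGCATCAGTTTGAATG"

# First 60 bases of SEQ ID NO: 1 from the sequence listing (spaces removed).
SEQ1_START = "GGGACTCCCCCAAACCAATGTGGAATACATTCAAACTGATGCAGCTATTGATTTTGGAAA"

if __name__ == "__main__":
    # The two forward primers should differ only at their 3'-terminal base.
    print(differing_positions(PRIMER_1AFC, PRIMER_1AFT))
    # The reverse primer should anneal to the strand complementary to SEQ ID NO: 1.
    print(SEQ1_START.find(reverse_complement(PRIMER_1AR)))
```

Placing the variable base at the 3' end of the forward primer is what makes each primer pair selective for one allele, since a 3'-terminal mismatch is extended poorly by the polymerase.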
*y =Oc^ K» (1AFC+1AR, itttlA 

FT+1AR) SrJilT^frTt'PCRCffifflLt: 9 
4tt'4 0^ 6 0"Ct3 0#\ lUKlenTaql 
y ^ • !/Sf y K) * 5 0 mM Tr is-C 
1 pH9.U 16mMM7^^, 3.5mMM 
gCl a , 1 5 0 it g m 1 _l B S A:teJ:tfS*tt*j3WD 

t hyy a 2 5 n gSr-g-^r-rs— e^sjct3 sim*^ 

/K 0 KO#»liyyADNA0 12 

^/v (^3-BRL) Lfco 9 5SS*tC0 

^(cSSi^fr-Cfea-irSr^i-i 2<@cDDNA£>5*> 
8fflJd*3«t5iai*^ASO-eB*&nfco lAFcty 

=*X* KO*-C2fflODNA«ri««U 

fc5 0 1 AFT* !) rf5t^ l^tf KO*-C2fiODNA 

TraaS5-&flcc*a« pspity^^utf kib 

FC, 1 BFTfc±ai BRtt, A 1 aSrVa 1 KIT'S 
y»8Eft:**6^^ ^*^K1 4 3 5 (S^^^b*- 

1BFC: TGG CGG GCT TTG GGG GGC ATT C, 22-mer (SEQ ID NO: 35)
1BFT: TGG CGG GCT TTG GGG GGC ATT T, 22-mer (SEQ ID NO: 36)
1BR: GAC GTC AGC AGG GCC CGG AGG TC, 23-mer (SEQ ID NO: 37)
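Because each primer set is used under its own reaction conditions, it can help to compare the base composition of the primers. The sketch below applies the Wallace rule, Tm ≈ 2(A+T) + 4(G+C), which is only a rough heuristic and was not necessarily used by the authors; the function names `gc_fraction` and `wallace_tm` are illustrative assumptions.

```python
# Primers copied from the lists in this section, with spaces removed.
PRIMERS = {
    "1AFC": "CATCCGGCATTGTTAGCTCTGC",
    "1AR": "CAATAGCTGCATCAGTTTGAATG",
    "1BFC": "TGGCGGGCTTTGGGGGGCATTC",
    "1BR": "GACGTCAGCAGGGCCCGGAGGTC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: GC fraction {gc_fraction(seq):.2f}, Wallace Tm {wallace_tm(seq)}")
```

Under this heuristic the GC-rich 1B primers come out with higher estimates than the 1A primers, so they would be expected to tolerate a higher annealing temperature.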
*y=r^^U*^-K*t (1BFC+1BR, SfcltlB 
FT+1BR) *«TO*«=T"CPCRlc:ffifflLfc : 9' 
4tt4 0^ 6 7tt'3 0^ lUKlenTaql 
y ^ • y K) , 5 0 mM T r i s - C 

1 pH9.1 % 16mMlSS7^^ v 3.5mM M* 
*gCU, 1 5 0 m gm 1 ~ l B S A*3ir/SjtE|i**0<O

/K • KO*»(iy/ ADNA(D 1 2 

y/U (^n-BRL) TMMttMttLfc. 7 5*Sg*tCQ 

3te5#fic3ie-¥'Sr=fir-f 51 k Sr^t"l 2j@<7?-^V7 p /K^ 
?*>9<@T\ 1 BFT ASO«rfflV*T»tt&ftfc. 
[00 7 0] Hifi^J 5- D87258 g>m&<D&tti 
10 ty^^^f K2AFC, 2 AFTfc±(J J 2 AR 
tt, Gl ySrVa 1 ICTS ✓tt»fc**6**V*^K 

1 3 2 5 (^T^V^f^^S^) 

2AFG: GAT ACC CCA GCA GAA GCT GG, 20-mer (SEQ ID NO: 38)
2AFT: GAT ACC CCA GCA GAA GCT GT, 20-mer (SEQ ID NO: 39)
2AR: GCT GAC ATC ATT GGC GGA GAC, 21-mer (SEQ ID NO: 40)

20 *y =**^ K*t (2 AFC + 2 AR, ifcl*2A 

FT+2AR) *rKT^>*fl=T-ePCRlC«fflLfc : 9 
4°C-C40S\ 6 2tt3 0&\ lUKlenTaql 
(v?W*;y ^ "DSfyK) „ 5 OmM Tr is-C 
1 pH9.U 16mMSS7y^!)^ 3.5mMM 
gCl 2i 1 5 0 m gm 1 " l B S A*5it;fi*f4*fcO 
thyM25n g (SOKJC-e 3 5 f--f ^ 

/K 0 *y ziyi 9 l/tf K«)*#ttyyADNAfl)l 2 
Oy^^ty^uKttLtKRL, S4««r4%*35 
y/l" (^3-BRL) T-®M»»]Lfc 0 2 AFT A 

30 SOIiftl 0 0 0*S»O-o©/<yK*4Cfc 0 

/<y K^aE-tt*y K2 AR*3 itf2 AF 

Tld* 9 y.y >*ls?£tlZ>W&&'( >hv ><Dft&\C 
B4t>^-e*)5 0 2 AFTT*ti*ILfc^>7 p yU<^^:T(c 



40 H£^^%tr^#T-fc<5o 
[0071]
[Sequence Listing]
SEQ ID NO: 1

GGGACTCCCC CAAACCAATG TGGAATACAT TCAAACTGAT GCAGCTATTG ATTTTGGAAA 60
CTCTGGAGGT CCCCTGGTTA ACCTGGATGG GGAGGTGATT GGAGTGAACA CCATGAAGGT 120
CACAGCTGGA ATCTCCTTTG CCATCCCTTC TGATCGTCTT CGAGAGTTTC TGCATCGTGG 180
GGAAAAGAAG AATTCCTCCT CCGGAATCAG TGGGTCCCAG CGGCGCTACA TTGGGGTGAT 240
GATGCTGACC CTGAGTCCCA GCATCCTTGC TGAACTACAG CTTCGAGAAC CAAGCTTTCC 300
CGATGTTCAG CATGGTGTAC TCATCCATAA AGTCATCCTG GGCTCCCCTG CACACCGGGC 360
TGGTCTGCGG CCTGGTGATG TGATTTTGGC CATTGGGGAG CAGATGGTAC AAAATGCTGA 420
AGATGTTTAT GAAGCTGTTC GAACCCAATC CCAGTTGGCA GTGCAGATCC GGCGGGGACG 480
AGAAACACTG ACCTTATATG TGACCCCTGA GGTCACAGAA TGAATAGATC ACCAAGAGTA 540
TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 600
CAGAGGGTTA AATGAACCAG TGGGGGCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 660
CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAACAAAAAA 720
AAAAAAAAAA AA 732

SEQ ID NO: 2

Gly Leu Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile
1 5 10 15
Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val
20 25 30
Ile Gly Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile
35 40 45
Pro Ser Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn
50 55 60
Ser Ser Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met
65 70 75 80
Met Leu Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu
85 90 95
Pro Ser Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile
100 105 110
Leu Gly Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile
115 120 125
Leu Ala Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu
130 135 140
Ala Val Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg
145 150 155 160
Glu Thr Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
165 170

SEQ ID NO: 3

CCCAGTCTCT GGGCCCGGTT GTCTGTTGGG GTCACTGAAC CCCGAGCATG CCTGACGTCT 60
GGGACCCCGG GTCCCCGGGC ACAACTGACT GCGGTGACCC CAGATACCAG GACCCGGGAG 120 
GCCTCAGAGA ACTCTGGAAC CCGTTCGCGC GCGTGGCTGG CGGTGGCGCT GGGCGCTGGG 180 
GGGGCAGTGC TGTTGTTGTT GTGGGGCGGG GGTCGGGGTC CTCCGGCCGT CCTCGCCGCC 240
GTCCCTAGCC CGCCGCCCGC TTCTCCCCGG AGTCAGTACA ACTTCATCGC AGATGTGGTG 300 
GAGAAGACAG CACCTGCCGT GGTCTATATC GAGATCCTGG ACCGGCACCC TTTCTTGGGC 360 
CGCGAGGTCC CTATCTCGAA CGGCTCAGGA TTCGTGGTGG CTGCCGATGG GCTCATTGTC 420 

ACCAACGCCC ATGTGGTGGC TGATCGGCGC AGAGTCCGTG TGAGACTGCT AAGCGGCGAC 480

ACGTATGAGG CCGTGGTCAC AGCTGTGGAT CCCGTGGCAG ACATCGCAAC GCTGAGGATT 540 
CAGACTAAGG AGCCTCTCCC CACGCTGCCT CTGGGACGCT CAGCTGATGT CCGGCAAGGG 600 
GAGTTTGTTG TTGCCATGGG AAGTCCCTTT GCACTGCAGA ACACGATCAC ATCCGGCATT 660 
GTTAGCTCTG CTCAGCGTCC AGCCAGAGAC CTGGGACTCC CCCAAACCAA TGTGGAATAC 720
ATTCAAACTG ATGCAGCTAT TGATTTTGGA AACTCTGGAG GTCCCCTGGT TAACCTGGTG 780 
AGTGAGACAT CCTTCCTTCC AAGAATCCCT GCCCCAGGTC AGTGTGGGAA GGGTAGGTTT 840 
CCCCTAATTC AAGGATGTTT GGTCAAGTTT CTGAGCAGTT CTTTGTTGGC TATCTCTCAA 900 
TATCCAACCA GATCTCCCCA ACACTTGCTG GTACTTTTGT TCGGGTGCCC CCATCCCCTA 960 
CTATTTGTTT AGGCTAGGGA ACTGGGGGCT GTATCCCTGC AGGATGGGGA GGTGATTGGA 1020 
GTGAACACCA TGAAGGTCAC AGCTGGAATC TCCTTTGCCA TCCCTTCTGA TCGTCTTCGA 1080 
GAGTTTCTGC ATCGTGGGGA AAAGAAGAAT TCCTCCTCCG GAATCAGTGG GTCCCAGCGG 1140

CGCTACATTG GGGTGATGAT GCTGACCCTG AGTCCCAGCA TCCTTGCTGA ACTACAGCTT 1200 

CGAGAACCAA GCTTTCCCGA TGTTCAGCAT GGTGTACTCA TCCATAAAGT CATCCTGGGC 1260

TCCCCTGCAC ACCGGGCTGG TCTGCGGCCT GGTGATGTGA TTTTGGCCAT TGGGGAGCAG 1320 

ATGGTACAAA ATGCTGAAGA TGTTTATGAA GCTGTTCGAA CCCAATCCCA GTTGGCAGTG 1380 

CAGATCCGGC GGGGACGAGA AACACTGACC TTATATGTGA CCCCTGAGGT CACAGAATGA 1440 

ATAGATCACC AAGAGTATGA GGCTCCTGCT CTGATTTCCT CCTTGCCTTT CTGGCTGAGG 1500 

TTCTGAGGGC ACCGAGACAG AGGGTTAAAT GAACCAGTGG GGGCAGGTCC CTCCAACCAC 1560 

CAGCACTGAC TCCTGGGCTC TGAAGAATCA CAGAAACACT TTTTATATAA AATAAAATTA 1620 

TACCTAGCAA CATATTATAG TAAAAAATGA GGTGGGAGGG CTGGATCTTT TCCCCCACCA 1680 

AAAGGCTAGA GGTAAAGCTG TATCCCCCTA AACTTAGGGG AGATACTGGA GCTGACCATC 1740 

CTGACCTCCT ATTAAAGAAA ATGAGCTGCT GAAAAAAAAA AAAAAAA 1787 

SEQ ID NO: 4

Pro Ser Leu Trp Ala Arg Leu Ser Val Gly Val Thr Glu Pro Arg Ala
1 5 10 15
Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg Ala Gln Leu Thr Ala Val
20 25 30
Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser Glu Asn Ser Gly Thr Arg
35 40 45
Ser Arg Ala Trp Leu Ala Val Ala Leu Gly Ala Gly Gly Ala Val Leu
50 55 60
Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro Pro Ala Val Leu Ala Ala
65 70 75 80
Val Pro Ser Pro Pro Pro Ala Ser Pro Arg Ser Gln Tyr Asn Phe Ile
85 90 95
Ala Asp Val Val Glu Lys Thr Ala Pro Ala Val Val Tyr Ile Glu Ile
100 105 110
Leu Asp Arg His Pro Phe Leu Gly Arg Glu Val Pro Ile Ser Asn Gly
115 120 125
Ser Gly Phe Val Val Ala Ala Asp Gly Leu Ile Val Thr Asn Ala His
130 135 140
Val Val Ala Asp Arg Arg Arg Val Arg Val Arg Leu Leu Ser Gly Asp
145 150 155 160
Thr Tyr Glu Ala Val Val Thr Ala Val Asp Pro Val Ala Asp Ile Ala
165 170 175
Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu Pro Thr Leu Pro Leu Gly
180 185 190
Arg Ser Ala Asp Val Arg Gln Gly Glu Phe Val Val Ala Met Gly Ser
195 200 205
Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser Gly Ile Val Ser Ser Ala
210 215 220
Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro Gln Thr Asn Val Glu Tyr
225 230 235 240
Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly Asn Ser Gly Gly Pro Leu
245 250 255
Val Asn Leu Val Ser Glu Thr Ser Phe Leu Pro Arg Ile Pro Ala Pro
260 265 270
Gly Gln Cys Gly Lys Gly Arg Phe Pro Leu Ile Gln Gly Cys Leu Val
275 280 285
Lys Phe Leu Ser Ser Ser Leu Leu Ala Ile Ser Gln Tyr Pro Thr Arg
290 295 300
Ser Pro Gln His Leu Leu Val Leu Leu Phe Gly Cys Pro His Pro Leu
305 310 315 320
Leu Phe Val

SEQ ID NO: 5

CTTCGGGCAT GGCGGGCTTT GGGGGGCATT CGCTGGGGGA GGAGACCCCG TTTGACCCCT 60 

GACCTCCGGG CCCTGCTGAC GTCAGGAACT TCTGACCCCC GGGCCCGAGT GACTTATGGG 120

ACCCCCAGTC TCTGGGCCCG GTTGTCTGTT GGGGTCACTG AACCCCGAGC ATGCCTGACG 180 

TCTGGGACCC CGGGTCCCCG GGCACAACTG ACTGCGGTGA CCCCAGATAC CAGGACCCGG 240 

GAGGCCTCAG AGAACTCTGG AACCCGTTCG CGCGCGTGGC TGGCGGTGGC GCTGGGCGCT 300 

GGGGGGGCAG TGCTGTTGTT GTTGTGGGGC GGGGGTCGGG GTCCTCCGGC CGTCCTCGCC 360 

GCCGTCCCTA GCCCGCCGCC CGCTTCTCCC CGGAGTCAGT ACAACTTCAT CGCAGATGTG 420 

GTGGAGAAGA CAGCACCTGC CGTGGTCTAT ATCGAGATCC TGGACCGGCA CCCTTTCTTG 480 

GGCCGCGAGG TCCCTATCTC GAACGGCTCA GGATTCGTCG TGGCTGCCGA TGGGCTCATT 540 

GTCACCAACG CCCATGTGGT GGCTGATCGG CGCAGAGTCC GTGTGAGACT GCTAAGCGGC 600 

GACACGTATG AGGCCGTGGT CACAGCTGTG GATCCCGTGG CAGACATCGC AACGCTGAGG 660 

ATTCAGACTA AGGAGCCTCT CCCCACGCTG CCTCTGGGAC GCTCAGCTGA TGTCCGGCAA 720

GGGGAGTTTG TTGTTGCCAT GGGAAGTCCC TTTGCACTGC AGAACACGAT CACATCCGGC 780 

ATTGTTAGCT CTGCTCAGCG TCCAGCCAGA GACCTGGGAC TCCCCCAAAC CAATGTGGAA 840 

TACATTCAAA CTGATGCAGC TATTGATTTT GGAAACTCTG GAGGTCCCCT GGTTAACCTG 900 

GCTAGGGAAC TGGGGGCTGT ATCCCTGCAG GATGGGGAGG TGATTGGAGT GAACACCATG 960 

AAGGTCACAG CTGGAATCTC CTTTGCCATC CCTTCTGATC GTCTTCGAGA GTTTCTGCAT 1020 

CGTGGGGAAA AGAAGAATTC CTCCTCCGGA ATCAGTGGGT CCCAGCGGCG CTACATTGGG 1080 

GTGATGATGC TGACCCTGAG TCCCAGGGCT GGTCTGCGGC CTGGTGATGT GATTTTGGCC 1140 

ATTGGGGAGC AGATGGTACA AAATGCTGAA GATGTTTATG AAGCTGTTCG AACCCAATCC 1200 

CAGTTGGCAG TGCAGATCCG GCGGGGACGA GAAACACTGA CCTTATATGT GACCCCTGAG 1260 

GTCACAGAAT GAATAGATCA CCAAGAGTAT GAGGCTCCTG CTCTGATTTC CTCCTTGCCT 1320 

TTCTGGCTGA GGTTCTGAGG GCACCGAGAC AGAGGGTTAA ATGAACCAGT GGGGGCAGGT 1380 

CCCTCCAACC ACCAGCACTG ACTCCTGGGC TCTGAAGAAT CACAGAAACA CTTTTTATAT 1440 

AAAATAAAAT TATACCTAGC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1500 

AAA 1503

SEQ ID NO: 6

Leu Arg Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro
1 5 10 15
Arg Leu Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp
20 25 30
Pro Arg Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu
35 40 45
Ser Val Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro
50 55 60
Gly Pro Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg
65 70 75 80
Glu Ala Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val
85 90 95
Ala Leu Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly
100 105 110
Arg Gly Pro Pro Ala Val Leu Ala Ala Val Pro Ser P

Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg Thr Gln Ser
385 390 395 400
Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu Thr Leu Tyr
405 410 415
Val Thr Pro Glu Val Thr Glu
420

[0077]
SEQ ID NO: 7

GGCCGGAAGG GCTAGCGGTC CCAGCATACC CCGCGGCCCC TTGCGCCGTC TCACAACTCG 60 
CGTCCGGCGG AGACCACAAT TCCCGGCATT CGTGGGGCAT GGAGGAGTCG GCCTCCCGGA 120 
ATCCTGGTCC CGGCGTGCAC TTCTGAAGGA CTTCAGGTAC CGGCGTGCCC CGCGTCCTAC 180 
TGTCCGCCTG CTCGCGTCCT GGGTGCCGCC TCTGAGTAGG GCGGGCGAGG AGGCAGCCAA 240 
GGCGGAGCTG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC 289 
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser 
1 5 10 

CTT CGG GCA TGG CGG GCT TTG GGG GGC ATT TGC TGG GGG AGG AGA CCC 337 
Leu Arg Ala Trp Arg Ala Leu Gly Gly Ile Cys Trp Gly Arg Arg Pro

15 20 25 

CGT TTG ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC 385 
Arg Leu Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp 
30 35 40 45 

CCC CGG GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG 433 
Pro Arg Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu 

50 55 60 

TCT GTT GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG 481 
Ser Val Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro 

65 70 75 

GGT CCC CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG 529 
Gly Pro Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg

80 85 90 

GAG GCC TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG 577 
Glu Ala Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val 

95 100 105

GCG CTG GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT 625
Ala Leu Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly 
110 115 120 125 

CGG GGT CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT 673 
Arg Gly Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala 

130 135 140

TCT CCC CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA 721
Ser Pro Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr

145 150 155 

GCA CCT GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG 769 
Ala Pro Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu

160 165 170 

GGC CGC GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC 817 
Gly Arg Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala

175 180 185 

GAT GGG CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA 865 
Asp Gly Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg
190 195 200 205 

GTC CGT GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA 913 
Val Arg Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr 

210 215 220 

GCT GTG GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG 961 
Ala Val Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys

225 230 235 

GAG CCT CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA 1009 
Glu Pro Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln

240 245 250 

GGG GAG TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG 1057 
Gly Glu Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr

255 260 265 

ATC ACA TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG 1105 
Ile Thr Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu
270 275 280 285 

GGA CTC CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT 1153 
Gly Leu Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile

290 295 300

GAT TTT GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG 1201 
Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val 

305 310 315 

ATT GGA GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC 1249 
Ile Gly Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile

320 325 330 

CCT TCT GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT 1297 
Pro Ser Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn 

335 340 345 

TCC TCC TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG 1345 
Ser Ser Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met
350 355 360 365 

ATG CTG ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA 1393 
Met Leu Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu

370 375 380 

CCA AGC TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC 1441 
Pro Ser Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile

385 390 395 

CTG GGC TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT GGT GAT GTG ATT 1489 
Leu Gly Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile

400 405 410 

TTG GCC ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA 1537 
Leu Ala Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu

415 420 425 

GCT GTT CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA 1585 
Ala Val Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg
430 435 440 445 

GAA ACA CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACC 1637 
Glu Thr Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu 
450 455 
AAGAGTATGA GGCTCCTGCT CTGATTTCCT CCTTGCCTTT CTGGCTGAGG TTCTGAGGGC 1697
ACCGAGACAG AGGGTTAAAT GAACCAGTGG GGGCAGGTCC CTCCAACCAC CAGCACTGAC 1757 
TCCTGGGCTC TGAAGAATCA CAGAAACACT TTTTATATAA AATAAAATTA TACCTAGCAA 1817 
CATAAAAAAA AAAAAAAA 1835 
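The codon-annotated listing above pairs each nucleotide row with its amino acid translation. A minimal sketch of that correspondence, assuming the standard genetic code, is given below; it rebuilds the codon table from the conventional TCAG ordering and translates the first coding row. The names `translate`, `CODON_TABLE` and `FIRST_CODING_ROW` are illustrative assumptions.

```python
# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# First coding row of the listing above (nucleotides 251-289).
FIRST_CODING_ROW = "ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC"


def translate(dna: str) -> list[str]:
    """Translate DNA codon by codon, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(THREE_LETTER[amino_acid])
    return residues


if __name__ == "__main__":
    print(" ".join(translate(FIRST_CODING_ROW)))
    # The two codons that differ between the listings at the Ile-Arg/Cys-Trp site.
    print(translate("CGC"), translate("TGC"))
```

Running the same function over a full coding region is a quick way to spot rows where the printed nucleotides and the printed residues disagree, which in a scanned listing usually points to a character-recognition error.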

SEQ ID NO: 8

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15
Trp Arg Ala Leu Gly Gly Ile Cys Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30
Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45
Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60
Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80
Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95
Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110
Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125
Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140
Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160
Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175
Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190
Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205
Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220
Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240
Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255
Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270
Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285
Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300
Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly Val
305 310 315 320
Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp
325 330 335
Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser
340 345 350
Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr
355 360 365
Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser Phe
370 375 380
Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly Ser
385 390 395 400
Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile
405 410 415
Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg
420 425 430
Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu
435 440 445
Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

SEQ ID NO: 9

TGGGACAGGC AGCTCCGGGG TCCGCGGTTT CACATCGGAA ACAAAACAGC GGCTGGTCTG 60 

GAAGGAACCT GAGCTACGAG CCGCGGCGGC AGCGGGGCGG CGGGGAAGCG TATACCTAAT 120 

CTGGGAGCCT GCAAGTGACA ACAGCCTTTG CGGTCCTTAG ACAGCTTGGC CTGGAGGAGA 180 

ACACATGAAA GAAAGAACCT CAAGAGCCTT TGTTTTCTGT GAAACAGTAT TTCTATACAG 240 

TTGCTCCAAT GACAGAGTTA CCTGCACCGT TGTCGTACTT CCAGAATGCA CAGATGTCTG 300 

AGGACAACCA CCTGAGCAAT ACTGTACGTA GCCAGAATGA CAATAGAGAA CGGCAGGAGC 360 

ACAACGACAG ACGGAGCCTT GGCCACCCTG AGCCATTATC TAATGGACGA CCCCAGGGTA 420 

ACTCCCGGCA GGTGGTGGAG CAAGATGAGG AAGAAGATGA GGAGCTGACA TTGAAATATG 480 

GCGCCAAGCA TGTGATCATG CTCTTTGTCC CTGTGACTCT CTGCATGGTG GTGGTCGTGG 540 

CTACCATTAA GTCAGTCAGC TTTTATACCC GGAAGGATGG GCAGCTAATC TATACCCCAT 600 

TCACAGAAGA TACCGAGACT GTGGGCCAGA GAGCCCTGCA CTCAATTCTG AATGCTGCCA 660 

TCATGATCAG TGTCATTGTT GTCATGACTA TCCTCCTGGT GGTTCTGTAT AAATACAGGT 720 

GCTATAAGGT CATCCATGCC TGGCTTATTA TATCATCTCT ATTGTTGCTG TTCTTTTTTT 780 

CATTCATTTA CTTGGGGGAA GTGTTTAAAA CCTATAACGT TGCTGTGGAC TACATTACTG 840 

TTGCACTCCT GATCTGGAAT TTTGGTGTGG TGGGAATGAT TTCCATTCAC TGGAAAGGTC 900 

CACTTCGACT CCAGCAGGCA TATCTCATTA TGATTAGTGC CCTCATGGCC CTGGTGTTTA 960 

TCAAGTACCT CCCTGAATGG ACTGCGTGGC TCATCTTGGC TGTGATTTCA GTATATGATT 1020 

TAGTGGCTGT TTTGTGTCCG AAAGGTCCAC TTCGTATGCT GGTTGAAACA GCTCAGGAGA 1080 

GAAATGAAAC GCTTTTTCCA GCTCTCATTT ACTCCTCAAC AATGGTGTGG TTGGTGAATA 1140 

TGGCAGAAGG AGACCCGGAA GCTCAAAGGA GAGTATCCAA AAATTCCAAG TATAATGCAG 1200 

AAAGCACAGA AAGGGAGTCA CAAGACACTG TTGCAGAGAA TGATGATGGC GGGTTCAGTG 1260 

AGGAATGGGA AGCCCAGAGG GACAGTCATC TAGGGCCTCA TCGCTCTACA CCTGAGTCAC 1320 

GAGCTGCTGT CCAGGAACTT TCCAGCAGTA TCCTCGCTGG TGAAGACCCA GAGGAAAGGG 1380 

GAGTAAAACT TGGATTGGGA GATTTCATTT TCTACAGTGT TCTGGTTGGT AAAGCCTCAG 1440 

CAACAGCCAG TGGAGACTGG AACACAACCA TAGCCTGTTT CGTAGCCATA TTAATTGGTT 1500

TGTGCCTTAC ATTATTACTC CTTGCCATTT TCAAGAAAGC ATTGCCAGCT CTTCCAATCT 1560 

CCATCACCTT TGGGCTTGTT TTCTACTTTG CCACAGATTA TCTTGTACAG CCTTTTATGG 1620 

ACCAATTAGC ATTCCATCAA TTTTATATCT AGCATATTTG CGGTTAGAAT CCCATGGATG 1680 

TTTCTTCTTT GACTATAACC AAATCTGGGG AGGACAAAGG TGATTTTCCT GTGTCCACAT 1740 

CTAACAAAGT CAAGATTCCC GGCTGGACTT TTGCAGCTTC CTTCCAAGTC TTCCTGACCA 1800 

CCTTGCACTA TTGGACTTTG GAAGGAGGTG CCTATAGAAA ACGATTTTGA ACATACTTCA 1860 

TCGCAGTGGA CTGTGTCCCT CGGTGCAGAA ACTACCAGAT TTGAGGGACG AGGTCAAGGA 1920 

GATATGATAG GCCCGGAAGT TGCTGTGCCC CATCAGCAGC TTGACGCGTG GTCACAGGAC 1980 

GATTTCACTG ACACTGCGAA CTCTCAGGAC TACCGGTTAC CAAGAGGTTA GGTGAAGTGG 2040 

TTTAAACCAA ACGGAACTCT TCATCTTAAA CTACACGTTG AAAATCAACC CAATAATTCT 2100 

GTATTAACTG AATTCTGAAC TTTTCAGGAG GTACTGTGAG GAAGAGCAGG CACCAGCAGC 2160 

AGAATGGGGA ATGGAGAGGT GGGCAGGGGT TCCAGCTTCC CTTTGATTTT TTGCTGCAGA 2220 

CTCATCCTTT TTAAATGAGA CTTGTTTTCC CCTCTCTTTG AGTCAAGTCA AATATGTAGA 2280 

TTGCCTTTGG CAATTCTTCT TCTCAAGCAC TGACACTCAT TACCGTCTGT GATTGCCATT 2340 

TCTTCCCAAG GCCAGTCTGA ACCTGAGGTT GCTTTATCCT AAAAGTTTTA ACCTCAGGTT 2400 

CCAAATTCAG TAAATTTTGG AAACAGTACA GCTATTTCTC ATCAATTCTC TATCATGTTG 2460 

AAGTCAAATT TGGATTTTCC ACCAAATTCT GAATTTGTAG ACATACTTGT ACGCTCACTT 2520 

GCCCCCAGAT GCCTCCTCTG TCCTCATTCT TCTCTCCCAC ACAAGCAGTC TTTTTCTACA 2580 

GCCAGTAAGG CAGCTCTGTC RTGGTAGCAG ATGGTCCCAT TATTCTAGGG TCTTACTCTT 2640 

TGTATGATGA AAAGAATGTG TTATGAATCG GTGCTGTCAG CCCTGCTGTC AGACCTTCTT 2700 

CCACAGCAAA TGAGATGTAT GCCCAAAGCG GTAGAATTAA AGAAGAGTAA AATGGCTGTT 2760 

GAAG 2764 

SEQ ID NO: 10

Met Thr Glu Leu Pro Ala Pro Leu Ser Tyr Phe Gln Asn Ala Gln Met
1 5 10 15
Ser Glu Asp Asn His Leu Ser Asn Thr Val Arg Ser Gln Asn Asp Asn
20 25 30
Arg Glu Arg Gln Glu His Asn Asp Arg Arg Ser Leu Gly His Pro Glu
35 40 45
Pro Leu Ser Asn Gly Arg Pro Gln Gly Asn Ser Arg Gln Val Val Glu
50 55 60
Gln Asp Glu Glu Glu Asp Glu Glu Leu Thr Leu Lys Tyr Gly Ala Lys
65 70 75 80
His Val Ile Met Leu Phe Val Pro Val Thr Leu Cys Met Val Val Val
85 90 95
Val Ala Thr Ile Lys Ser Val Ser Phe Tyr Thr Arg Lys Asp Gly Gln
100 105 110
Leu Ile Tyr Thr Pro Phe Thr Glu Asp Thr Glu Thr Val Gly Gln Arg
115 120 125
Ala Leu His Ser Ile Leu Asn Ala Ala Ile Met Ile Ser Val Ile Val
130 135 140
Val Met Thr Ile Leu Leu Val Val Leu Tyr Lys Tyr Arg Cys Tyr Lys
145 150 155 160
Val Ile His Ala Trp Leu Ile Ile Ser Ser Leu Leu Leu Leu Phe Phe
165 170 175
Phe Ser Phe Ile Tyr Leu Gly Glu Val Phe Lys Thr Tyr Asn Val Ala
180 185 190
Val Asp Tyr Ile Thr Val Ala Leu Leu Ile Trp Asn Phe Gly Val Val
195 200 205
Gly Met Ile Ser Ile His Trp Lys Gly Pro Leu Arg Leu Gln Gln Ala
210 215 220
Tyr Leu Ile Met Ile Ser Ala Leu Met Ala Leu Val Phe Ile Lys Tyr
225 230 235 240
Leu Pro Glu Trp Thr Ala Trp Leu Ile Leu Ala Val Ile Ser Val Tyr
245 250 255
Asp Leu Val Ala Val Leu Cys Pro Lys Gly Pro Leu Arg Met Leu Val
260 265 270
Glu Thr Ala Gln Glu Arg Asn Glu Thr Leu Phe Pro Ala Leu Ile Tyr
275 280 285
Ser Ser Thr Met Val Trp Leu Val Asn Met Ala Glu Gly Asp Pro Glu
290 295 300
Ala Gln Arg Arg Val Ser Lys Asn Ser Lys Tyr Asn Ala Glu Ser Thr
305 310 315 320
Glu Arg Glu Ser Gln Asp Thr Val Ala Glu Asn Asp Asp Gly Gly Phe
325 330 335
Ser Glu Glu Trp Glu Ala Gln Arg Asp Ser His Leu Gly Pro His Arg
340 345 350
Ser Thr Pro Glu Ser Arg Ala Ala Val Gln Glu Leu Ser Ser Ser Ile
355 360 365
Leu Ala Gly Glu Asp Pro Glu Glu Arg Gly Val Lys Leu Gly Leu Gly
370 375 380
Asp Phe Ile Phe Tyr Ser Val Leu Val Gly Lys Ala Ser Ala Thr Ala
385 390 395 400
Ser Gly Asp Trp Asn Thr Thr Ile Ala Cys Phe Val Ala Ile Leu Ile
405 410 415
Gly Leu Cys Leu Thr Leu Leu Leu Leu Ala Ile Phe Lys Lys Ala Leu
420 425 430
Pro Ala Leu Pro Ile Ser Ile Thr Phe Gly Leu Val Phe Tyr Phe Ala
435 440 445
Thr Asp Tyr Leu Val Gln Pro Phe Met Asp Gln Leu Ala Phe His Gln
450 455 460
Phe Tyr Ile
465

SEQ ID NO: 11

CGGAATTCCG TATGCTGGTT GAAACA 26

SEQ ID NO: 12

CGGGATCCTC AGGCTACGAA ACAGGCTAT 29
SEQ ID NO: 13

TATATCAGCG GTATGACCGA CCTCTATGCG TGGGATGAAT ACCGACGTCT GATGGCCGTA 60 

GAACAATAAC CAGGCTTTTG TAAAGACGAA CAATAAATTT TTACCTTTTG CAGAAACTTT 120 

AGTTCGGAAC TTCAGGCTAT AAAACGAATC TGAAGAACAC AGCAATTTTG CGTTATCTGT 180 

TAATCGAGAC TGAAATACAT GAAAAAAACC ACATTAGCAC TGAGTCGACT GGCTCTGAGT 240 

TTAGGTTTGG CGTTATCTCC GCTCTCTGCA ACGGCGGCTG AGACTTCTTC AGCAACGACA 300 

GCCCAGCAGA TGCCAAGCCT TGCACCGATG CTCGAAAAGG TGATGCCTTC AGTGGTCAGC 360 

ATTAACGTAG AAGGTAGCAC AACCGTTAAT ACGCCGCGTA TGCCGCGTAA TTTCCAGCAG 420 

TTCTTCGGTG ATGATTCTCC GTTCTGCCAG GAAGGTTCTC CGTTCCAGAG CTCTCCGTTC 480 

TGCCAGGGTG GCCAGGGCGG TAATGGTGGC GGCCAGCAAC AGAAATTCAT GGCGCTGGGT 540 

TCCGGCGTCA TCATTGATGC CGATAAAGGC TATGTCGTCA CCAACAACCA CGTTCTTGAT 600 

AACGCGACGG TCATTAAAGT TCAACTGAGC GATGGCCGTA AGTTCGACGC GAAGATGGTT 660 

GGCAAAGATC CGCGCTCTGA TATCGCGCTG ATCCAAATCC AGAACCCGAA AAACCTGACC 720 

GCAATTAAGA TGGCGGATTC TGATGCACTG CGCGTGGGTG ATTACACCGT AGGGATTGGT 780 

AACCCGTTTG GTCTGGGCGA GACGGTAACT TCCGGGATTG TCTCTGCGCT GGGGCGTAGC 840

GGCCTGAATG CCGAAAACTA CGAAAACTTC ATCCAGACCG ATGCAGCGAT CAACCGTGGT 900 

AACTCCGGTG GTGCGCTGGT TAACCTGAAC GGCGAACTGA TCGGTATCAA CACCGCGATC 960 

CTCGCACCGG ACGGCGGCAA CATCGGTATC GGTTTTGCTA TCCCGAGTAA CATGGTGAAA 1020 

AACCTGACCT CGCAGATGGT GGAATACGGC CAGGTGAAAC GCGGTGAGCT GGGTATTATG 1080 

GGGACTGAGC TGAACTCCGA ACTGGCGAAA GCGATGAAAG TTGACGCCCA GCGCGGTGCT 1140 

TTCGTAAGCC AGGTTCTGCC TAATTCCTCC GCTGCAAAAG CGGGCATTAA AGCGGGTGAT 1200 

GTGATCACCT CACTGAACGG TAAGCCGATC AGCAGCTTTG CCGCACTGCG TGCTCAGGTG 1260 

GGTACTATGC CGGTAGGCAG CAAACTGACC CTGGGCTTAC TGCGCGACGG TAAGCAGGTT 1320 

AACGTGAACC TGGAACTGCA GCAGAGCAGC CAGAATCAGG TTGATTCCAG CTCCATCTTC 1380 

AACGGCATTG AAGGCGCTGA GATGAGCAAC AAAGGCAAAG ATCAGGGCGT GGTAGTGAAC 1440 

AACGTGAAAA CGGGCACTCC GGCTGCGCAG ATCGGCCTGA AGAAAGGTGA TGTGATTATT 1500 

GGCGCGAACC AGCAGGCAGT GAAAAACATC GCTGAACTGC GTAAAGTTCT CGACAGCAAA 1560 

CCGTCTGTGC TGGCACTCAA CATTCAGCGC GGCGACCGCC ATCTACCTGT TAATGCAGTA 1620 

ATCTCCCTCA ACCCCTTCCT GAAAACGGGA AGGGGTTCTC CTTACAATCT GTGAACTTCA 1680 

CCACAACTCC ATACATCTTC ATCATCCTTT AGGCATTTGC ACAATGCCGT ACGTTACGTA 1740 

CTTCCTTATG CTAAGCCGTG CATAACGGAG GACTTATGGC TGGCTGGCAT CTTGATACCA 1800 

AAATGGCGCA GGATATCGTG GCACGTACCA TGCGCATCAT CGATACCAAT ATCA 1854 

SEQ ID NO: 14

Met Lys Lys Thr Thr Leu Ala Leu Ser Arg Leu Ala Leu Ser Leu Gly
1 5 10 15
Leu Ala Leu Ser Pro Leu Ser Ala Thr Ala Ala Glu Thr Ser Ser Ala
20 25 30
Thr Thr Ala Gln Gln Met Pro Ser Leu Ala Pro Met Leu Glu Lys Val
35 40 45
Met Pro Ser Val Val Ser Ile Asn Val Glu Gly Ser Thr Thr Val Asn
50 55 60
Thr Pro Arg Met Pro Arg Asn Phe Gln Gln Phe Phe Gly Asp Asp Ser
65 70 75 80
Pro Phe Cys Gln Glu Gly Ser Pro Phe Gln Ser Ser Pro Phe Cys Gln
85 90 95
Gly Gly Gln Gly Gly Asn Gly Gly Gly Gln Gln Gln Lys Phe Met Ala
100 105 110
Leu Gly Ser Gly Val Ile Ile Asp Ala Asp Lys Gly Tyr Val Val Thr
115 120 125
Asn Asn His Val Val Asp Asn Ala Thr Val Ile Lys Val Gln Leu Ser
130 135 140
Asp Gly Arg Lys Phe Asp Ala Lys Met Val Gly Lys Asp Pro Arg Ser
145 150 155 160
Asp Ile Ala Leu Ile Gln Ile Gln Asn Pro Lys Asn Leu Thr Ala Ile
165 170 175
Lys Met Ala Asp Ser Asp Ala Leu Arg Val Gly Asp Tyr Thr Val Gly
180 185 190
Ile Gly Asn Pro Phe Gly Leu Gly Glu Thr Val Thr Ser Gly Ile Val
195 200 205
Ser Ala Leu Gly Arg Ser Gly Leu Asn Ala Glu Asn Tyr Glu Asn Phe
210 215 220
Ile Gln Thr Asp Ala Ala Ile Asn Arg Gly Asn Ser Gly Gly Ala Leu
225 230 235 240
Val Asn Leu Asn Gly Glu Leu Ile Gly Ile Asn Thr Ala Ile Leu Ala
245 250 255
Pro Asp Gly Gly Asn Ile Gly Ile Gly Phe Ala Ile Pro Ser Asn Met
260 265 270
Val Lys Asn Leu Thr Ser Gln Met Val Glu Tyr Gly Gln Val Lys Arg
275 280 285
Gly Glu Leu Gly Ile Met Gly Thr Glu Leu Asn Ser Glu Leu Ala Lys
290 295 300
Ala Met Lys Val Asp Ala Gln Arg Gly Ala Phe Val Ser Gln Val Leu
305 310 315 320
Pro Asn Ser Ser Ala Ala Lys Ala Gly Ile Lys Ala Gly Asp Val Ile
325 330 335
Thr Ser Leu Asn Gly Lys Pro Ile Ser Ser Phe Ala Ala Leu Arg Ala
340 345 350
Gln Val Gly Thr Met Pro Val Gly Ser Lys Leu Thr Leu Gly Leu Leu
355 360 365
Arg Asp Gly Lys Gln Val Asn Val Asn Leu Glu Leu Gln Gln Ser Ser
370 375 380
Gln Asn Gln Val Asp Ser Ser Ser Ile Phe Asn Gly Ile Glu Gly Ala
385 390 395 400
Glu Met Ser Asn Lys Gly Lys Asp Gln Gly Val Val Val Asn Asn Val
405 410 415
Lys Thr Gly Thr Pro Ala Ala Gln Ile Gly Leu Lys Lys Gly Asp Val
420 425 430
Ile Ile Gly Ala Asn Gln Gln Ala Val Lys Asn Ile Ala Glu Leu Arg
435 440 445
Lys Val Leu Asp Ser Lys Pro Ser Val Leu Ala Leu Asn Ile Gln Arg
450 455 460
Gly Asp Arg His Leu Pro Val Asn Ala Val Ile Ser Leu Asn Pro Phe
465 470 475 480
Leu Lys Thr Gly Arg Gly Ser Pro Tyr Asn Leu
485 490

SEQ ID NO: 15

CTGGATGGGG AGGTGATTGG AGTG 24
SEQ ID NO: 16

GTCTCTGGGC CCCGGTTGTC TGTTG 25



SEQ ID NO: 17
Polymorphic at position 1325

CCGGCCCTCG CCCTGTCCGC CGCCACCGCC GCCGCCGCCA GAGTCGCCAT GCAGATCCCG 60
CGCGCCGCTC TTCTCCCGCT GCTGCTGCTG CTGCTGGCGG CGCCCGCCTC GGCGCAGCTG 120
TCCCGGGCCG GCCGCTCGGC GCCTTTGGCC GCCGGGTGCC CAGACCGCTG CGAGCCGGCG 180
CGCTGCCCGC CGCAGCCGGA GCACTGCGAG GGCGGCCGGG CCCGGGACGC GTGCGGCTGC 240
TGCGAGGTGT GCGGCGCGCC CGAGGGCGCC GCGTGCGGCC TGCAGGAGGG CCCGTGCGGC 300
GAGGGGCTGC AGTGCGTGGT GCCCTTCGGG GTGCCAGCCT CGGCCACGGT GCGGCGGCGC 360
GCGCAGGCCG GCCTCTGTGT GTGCGCCAGC AGCGAGCCGC TGTGCGGCAG CGACGCCAAC 420
ACCTACGCCA ACCTGTGCCA GCTGCGCGCC GCCAGCCGCC GCTCCGAGAG GCTGCACCGG 480
CCGCCGGTCA TCGTCCTGCA GCGCGGAGCC TGCGGCCAAG GGCAGGAAGA TCCCAACAGT 540
TTGCGCCATA AATATAACTT TATCGCGGAC GTGGTGGAGA AGATCGCCCC TGCCGTGGTT 600
CATATCGAAT TGTTTCGCAA GCTTCCGTTT TCTAAACGAG AGGTGCCGGT GGCTAGTGGG 660
TCTGGGTTTA TTGTGTCGGA AGATGGACTG ATCGTGACAA ATGCCCACGT GGTGACCAAC 720
AAGCACCGGG TCAAAGTTGA GCTGAAGAAC GGTGCCACTT ACGAAGCCAA AATCAAGGAT 780
GTGGATGAGA AAGCAGACAT CGCACTCATC AAAATTGACC ACCAGGGCAA GCTGCCTGTC 840
CTGCTGCTTG GCCGCTCCTC AGAGCTGCGG CCGGGAGAGT TCGTGGTCGC CATCGGAAGC 900
CCGTTTTCCC TTCAAAACAC AGTCACCACC GGGATCGTGA GCACCACCCA GCGAGGCGGC 960
AAAGAGCTGG GGCTCCGCAA CTCAGACATG GACTACATCC AGACCGACGC CATCATCAAC 1020 
TATGGAAACT CGGGAGGCCC GTTAGTAAAC CTGGACGGTG AAGTGATTGG AATTAACACT 1080 
TTGAAAGTGA CAGCTGGAAT CTCCTTTGCA ATCCCATCTG ATAAGATTAA AAAGTTCCTC 1140
ACGGAGTCCC ATGACCGACA GGCCAAAGGA AAAGCCATCA CCAAGAAGAA GTATATTGGT 1200 
ATCCGAATGA TGTCACTCAC GTCCAGCAAA GCCAAAGAGC TGAAGGACCG GCACCGGGAC 1260 
TTCCCAGACG TGATCTCAGG AGCGTATATA ATTGAAGTAA TTCCTGATAC CCCAGCAGAA 1320 
GCTGKTGGTC TCAAGGAAAA CGACGTCATA ATCAGCATCA ATGGACAGTC CGTGGTCTCC 1380 
GCCAATGATG TCAGCGACGT CATTAAAAGG GAAAGCACCC TGAACATGGT GGTCCGCAGG 1440 
GGTAATGAAG ATATCATGAT CACAGTGATT CCCGAAGAAA TTGACCCATA GGCAGAGGCA 1500 
TGAGCTGGAC TTCATGTTTC CCTCAAAGAC TCTCCCGTGG ATGACGGATG AGGACTCTGG 1560 
GCTGCTGGAA TAGGACACTC AAGACTTTTG ACTGCCATTT TGTTTGTTCA GTGGAGACTC 1620 
CCTGGCCAAC AGAATCCTTC TTGATAGTTT GCAGGCAAAA CAAATGTAAT GTTGCAGATC 1680 
CGCAGGCAGA AGCTCTGCCC TTCTGTATCC TATGTATGCA GTGTGCTTTT TCTTGCCAGC 1740 
TTGGGCCATT CTTGCTTAGA CAGTCAGCAT TTGTCTCCTC CTTTAACTGA GTCATCATCT 1800 
TAGTCCAACT AATGCAGTCG ATACAATGCG TAGATAGAAG AAGCCCCACG GGAGCCAGGA 1860 
TGGGACTGGT CGTGTTTGTG CTTTTCTCCA AGTCAGCACC CAAAGGTCAA TGCACAGAGA 1920 
CCCCGGGTGG GTGAGCGCTG GCTTCTCAAA CGGCCGAAGT TGCCTCTTTT AGGAATCTCT 1980 
TTGGAATTGG GAGCACGATG ACTCTGAGTT TGAGCTATTA AAGTACTTCT TACAAA 2036 
[0088] SEQ ID NO: 18
Gly/Val polymorphism at residue 213

Met Gln Ile Pro Arg Ala Ala Leu Leu Pro Leu Leu Leu Leu Leu Leu
1 5 10 15
Ala Ala Pro Ala Ser Ala Gln Leu Ser Arg Ala Gly Arg Ser Ala Pro
20 25 30
Leu Ala Ala Gly Cys Pro Asp Arg Cys Glu Pro Ala Arg Cys Pro Pro
35 40 45
Gln Pro Glu His Cys Glu Gly Gly Arg Ala Arg Asp Ala Cys Gly Cys
50 55 60
Cys Glu Val Cys Gly Ala Pro Glu Gly Ala Ala Cys Gly Leu Gln Glu
65 70 75 80
Gly Pro Cys Gly Glu Gly Leu Gln Cys Val Val Pro Phe Gly Val Pro
85 90 95
Ala Ser Ala Thr Val Arg Arg Arg Ala Gln Ala Gly Leu Cys Val Cys
100 105 110
Ala Ser Ser Glu Pro Val Cys Gly Ser Asp Ala Asn Thr Tyr Ala Asn
115 120 125
Leu Cys Gln Leu Arg Ala Ala Ser Arg Arg Ser Glu Arg Leu His Arg
130 135 140
Pro Pro Val Ile Val Leu Gln Arg Gly Ala Cys Gly Gln Gly Gln Glu
145 150 155 160
Asp Pro Asn Ser Leu Arg His Lys Tyr Asn Phe Ile Ala Asp Val Val
165 170 175
Glu Lys Ile Ala Pro Ala Val Val His Ile Glu Leu Phe Arg Lys Leu
180 185 190
Pro Phe Ser Lys Arg Glu Val Pro Val Ala Ser Gly Ser Gly Phe Ile
195 200 205
Val Ser Glu Asp Xaa Leu Ile Val Thr Asn Ala His Val Val Thr Asn
210 215 220
Lys His Arg Val Lys Val Glu Leu Lys Asn Gly Ala Thr Tyr Glu Ala
225 230 235 240
Lys Ile Lys Asp Val Asp Glu Lys Ala Asp Ile Ala Leu Ile Lys Ile
245 250 255
Asp His Gln Gly Lys Leu Pro Val Leu Leu Leu Gly Arg Ser Ser Glu
260 265 270
Leu Arg Pro Gly Glu Phe Val Val Ala Ile Gly Ser Pro Phe Ser Leu
275 280 285
Gln Asn Thr Val Thr Thr Gly Ile Val Ser Thr Thr Gln Arg Gly Gly
290 295 300
Lys Glu Leu Gly Leu Arg Asn Ser Asp Met Asp Tyr Ile Gln Thr Asp
305 310 315 320
Ala Ile Ile Asn Tyr Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp
325 330 335
Gly Glu Val Ile Gly Ile Asn Thr Leu Lys Val Thr Ala Gly Ile Ser
340 345 350
Phe Ala Ile Pro Ser Asp Lys Ile Lys Lys Phe Leu Thr Glu Ser His
355 360 365
Asp Arg Gln Ala Lys Gly Lys Ala Ile Thr Lys Lys Lys Tyr Ile Gly
370 375 380
Ile Arg Met Met Ser Leu Thr Ser Ser Lys Ala Lys Glu Leu Lys Asp
385 390 395 400
Arg His Arg Asp Phe Pro Asp Val Ile Ser Gly Ala Tyr Ile Ile Glu
405 410 415
Val Ile Pro Asp Thr Pro Ala Glu Ala Gly Gly Leu Lys Glu Asn Asp
420 425 430
Val Ile Ile Ser Ile Asn Gly Gln Ser Val Val Ser Ala Asn Asp Val
435 440 445
Ser Asp Val Ile Lys Arg Glu Ser Thr Leu Asn Met Val Val Arg Arg
450 455 460
Gly Asn Glu Asp Ile Met Ile Thr Val Ile Pro Glu Glu Ile Asp Pro
465 470 475 480

[0089]
SEQ ID NO: 19

ATGCTGAACA TCGGCAAAGC TTGGTTCTCG 30 



[0090]
SEQ ID NO: 20

CCAACAGACA ACCGGGCCCA GAGACT 26

[0091]
SEQ ID NO: 21

TGCCTCCTCG CCCGCCCTAC TCAGA 25

[0092]
SEQ ID NO: 22

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60 

GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120 

GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGCGTCT CTTACCGGTG 180 

CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240 

AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300 

GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360 

GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420 

GGAGACCACA ATTCCCGGCA TTCGTGCGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480 

CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540 

TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCA 587 



[0093]
SEQ ID NO: 23

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60 
GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120 
GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180 
CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240 
AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300 
GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360 
GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420 
GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480 
CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540 
TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600 
TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647 
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg 
1 5 10 15 

GCA TGG CGG GCT TTG GGG GGC ATT TGC TGG GGG AGG AGA CCC CGT TTG 695 
Ala Trp Arg Ala Leu Gly Gly Ile Cys Trp Gly Arg Arg Pro Arg Leu 

20 25 30 

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743 
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg 

35 40 45 

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791 
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val 

50 55 60 

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839 
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro 

65 70 75 

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887 
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala 
80 85 90 95 

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935 
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu 

100 105 110 

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983 
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly 

115 120 125 

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031 
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro 

130 135 140 

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079 
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro 

145 150 155 

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127 
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg 
160 165 170 175 

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175 
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly 

180 185 190 

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223 
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg 
195 200 205 

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271 
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val 

210 215 220 

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319 
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro 

225 230 235 

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367 
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu 
240 245 250 255 

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415 
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr 

260 265 270 

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463 
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu 

275 280 285 

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511 
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe 

290 295 300 

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG ATT GGA 1559 
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly 

305 310 315 

GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT 1607 
Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser 
320 325 330 335 

GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC 1655 
Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser 

340 345 350 

TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG ATG CTG 1703 
Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu 

355 360 365 

ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA CCA AGC 1751 
Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser 

370 375 380 

TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC CTG GGC 1799 
Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly 

385 390 395 

TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT GGT GAT GTG ATT TTG GCC 1847 
Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala 
400 405 410 415 

ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT 1895 
Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val 

420 425 430 

CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA 1943 
Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr 
435 440 445 

CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA 1996 
Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu 

450 455 
TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 2056 



[0094]
CAGAGCGTTA AATGAACCAG TGGGGGCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 2116 
CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAACATAAAA 2176 
AAAAAAAAAA A 2187 

SEQ ID NO: 24

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60 
GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120 
GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180 
CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240 
AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300 
GCGGAGTCTT TGGGCATCCG CCCCGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360 
GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420 
GGAGACCACA ATTCCCGGCA TTCGTGCGCC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480 
CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540 
TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600 
TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647 
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg 
1 5 10 15 

GCA TGG CGG GCT TTG GGG GGC ATT CGC TGG GGG AGG AGA CCC CGT TTG 695 
Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu 

20 25 30 

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743 
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg 

35 40 45 

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791 
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val 

50 55 60 

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839 
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro 

65 70 75 

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887 
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala 
80 85 90 95 

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935 
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu 

100 105 110 

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983 
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly 

115 120 125 

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031 
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro 

130 135 140 

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079 
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro 

145 150 155 

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127 
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg 
160 165 170 175 

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175 
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly 
180 185 190 

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223 
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg 

195 200 205 

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271 
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val 

210 215 220 

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319 
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro 

225 230 235 

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367 
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu 
240 245 250 255 

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415 
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr 

260 265 270 

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463 
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu 

275 280 285 

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511 
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe 
290 295 300 

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG ATT GGA 1559 
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly 

305 310 315 

GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT 1607 
Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser 
320 325 330 335 

GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC 1655 
Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser 

340 345 350 

TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG ATG CTG 1703 
Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu 

355 360 365 

ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA CCA AGC 1751 
Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser 

370 375 380 

TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC CTG GGC 1799 
Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly 

385 390 395 

TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT GGT GAT GTG ATT TTG GCC 1847 
Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala 
400 405 410 415 

ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT 1895 
Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val 

420 425 430 

CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA 1943 
Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr 
435 440 445 

CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA 1996 
Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 2056
CAGAGGGTTA AATGAACCAG TGGGGGCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 2116
CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAACATAAAA 2176
AAAAAAAAAA A 2187

SEQ ID NO: 25

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly Val
305 310 315 320

Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp
325 330 335

Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser
340 345 350

Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr
355 360 365

Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser Phe
370 375 380

Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly Ser
385 390 395 400

Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile
405 410 415

Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg
420 425 430

Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu
435 440 445

Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

[0096]
SEQ ID NO: 26

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60 
GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120 
GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180 
CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240 
AGCCCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300 
GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360 
GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420 
GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480 
CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540 
TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600 
TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647 
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg 
1 5 10 15 

GCA TGG CGG GCT TTG GGG GGC ATT CGC TGG GGG AGG AGA CCC CGT TTG 695 
Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu 

20 25 30 

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743 
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg 

35 40 45 

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791 
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val 

50 55 60 

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839 
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro 

65 70 75 

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887 
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala 
80 85 90 95 

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935 
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu 

100 105 110 

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983 



(37) 10-117789 
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly 

115 120 125 

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031 
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro 

130 135 140 

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079 
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro 

145 150 155 

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127 
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg 
160 165 170 175 

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175 
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly 

180 185 190 

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223 
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg 

195 200 205 

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271 
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val 

210 215 220 

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319 
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro 

225 230 235 

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367 
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu 
240 245 250 255 

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415 
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr 

260 265 270 

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463 
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu 

275 280 285 

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511 
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe 

290 295 300 

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GTG AGT GAG ACA TCC TTC 1559 
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Val Ser Glu Thr Ser Phe 

305 310 315 

CTT CCA AGA ATC CCT GCC CCA GGT CAG TGT GGG AAG GGT AGG TTT CCC 1607 
Leu Pro Arg Ile Pro Ala Pro Gly Gln Cys Gly Lys Gly Arg Phe Pro 
320 325 330 335 

CTA ATT CAA GGA TGT TTG GTC AAG TTT CTG AGC AGT TCT TTG TTG GCT 1655 
Leu Ile Gln Gly Cys Leu Val Lys Phe Leu Ser Ser Ser Leu Leu Ala 

340 345 350 

ATC TCT CAA TAT CCA ACC AGA TCT CCC CAA CAC TTG CTG GTA CTT TTG 1703 
Ile Ser Gln Tyr Pro Thr Arg Ser Pro Gln His Leu Leu Val Leu Leu 
355 360 365 

TTC GGG TGC CCC CAT CCC CTA CTA TTT GTT TAGGCTAGGG AACTGGGGGC TGTA 1757 
Phe Gly Cys Pro His Pro Leu Leu Phe Val 
370 375 



[0097]

TCCCTGCAGG ATGGGGAGGT GATTGGAGTG AACACCATGA AGGTCACAGC TGGAATCTCC 1817 

TTTGCCATCC CTTCTGATCG TCTTCGAGAG TTTCTGCATC GTGGGGAAAA GAAGAATTCC 1877 

TCCTCCGGAA TCAGTGGGTC CCAGCGGCGC TACATTGGGG TGATGATGCT GACCCTGAGT 1937 

CCCAGCATCC TTGCTGAACT ACAGCTTCGA GAACCAAGCT TTCCCGATGT TCAGCATGGT 1997 

GTACTCATCC ATAAAGTCAT CCTGGGCTCC CCTGCACACC GGGCTGGTCT GCGGCCTGGT 2057 

GATGTGATTT TGGCCATTGG GGAGCAGATG GTACAAAATG CTGAAGATGT TTATGAAGCT 2117 

GTTCGAACCC AATCCCAGTT GGCAGTGCAG ATCCGGCGGG GACGAGAAAC ACTGACCTTA 2177 

TATGTGACCC CTGAGGTCAC AGAATGAATA GATCACCAAG AGTATGAGGC TCCTGCTCTG 2237 

ATTTCCTCCT TGCCTTTCTG GCTGAGGTTC TGAGGGCACC GAGACAGAGG GTTAAATGAA 2297 

CCAGTGGGGG CAGGTCCCTC CAACCACCAG CACTGACTCC TGGGCTCTGA AGAATCACAG 2357 

AAACACTTTT TATATAAAAT AAAATTATAC CTAGCAACAT ATTATAGTAA AAAATGAGGT 2417 

GGGAGGGCTG GATCTTTTCC CCCACCAAAA GGCTAGAGGT AAAGCTGTAT CCCCCTAAAC 2477 

TTAGGGGAGA TACTGGAGCT GACCATCCTG ACCTCCTATT AAAGAAAATG AGCTGCTGAA 2537 

AAAAAAAAAA AAAA 2551 

SEQ ID NO: 27

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Val Ser Glu Thr Ser Phe Leu
305 310 315 320

Pro Arg Ile Pro Ala Pro Gly Gln Cys Gly Lys Gly Arg Phe Pro Leu
325 330 335

Ile Gln Gly Cys Leu Val Lys Phe Leu Ser Ser Ser Leu Leu Ala Ile
340 345 350

Ser Gln Tyr Pro Thr Arg Ser Pro Gln His Leu Leu Val Leu Leu Phe
355 360 365

Gly Cys Pro His Pro Leu Leu Phe Val
370 375

SEQ ID NO: 28

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60 

GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120 

GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180 

CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240 

AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300 

GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360 

GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420 

GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480 

CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540 

TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600 

TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647 
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg 
1 5 10 15 

GCA TGG CGG GCT TTG GGG GGC ATT CGC TGG GGG AGG AGA CCC CGT TTG 695 
Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu 

20 25 30 

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743 
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg 

35 40 45 

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791 
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val 

50 55 60 

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839 



Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro 

65 70 75 

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887 
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala 
80 85 90 95 

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935 
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu 

100 105 110 

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983 
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly 
115 120 125 



CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031 
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro 

130 135 140 

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079 
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro 

145 150 155 

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127 
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg 
160 165 170 175 

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175 
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly 

180 185 190 

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223 
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg 

195 200 205 

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271 
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val 

210 215 220 

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319 
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro 

225 230 235 

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367 
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu 
240 245 250 255 

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415 
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr 

260 265 270 

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463 
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu 

275 280 285 

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511 
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe 

290 295 300 

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GCT AGG GAA CTG GGG GCT 1559 
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Ala Arg Glu Leu Gly Ala 

305 310 315 

GTA TCC CTG CAG GAT GGG GAG GTG ATT GGA GTG AAC ACC ATG AAG GTC 1607 
Val Ser Leu Gln Asp Gly Glu Val Ile Gly Val Asn Thr Met Lys Val 
320 325 330 335 

ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT GAT CGT CTT CGA GAG TTT 1655 
Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp Arg Leu Arg Glu Phe 

340 345 350 

CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC TCC GGA ATC AGT GGG TCC 1703 
Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser Gly Ile Ser Gly Ser 

355 360 365 

CAG CGG CGC TAC ATT GGG GTG ATG ATG CTG ACC CTG AGT CCC AGG GCT 1751 
Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr Leu Ser Pro Arg Ala 

370 375 380 

GGT CTG CGG CCT GGT GAT GTG ATT TTG GCC ATT GGG GAG CAG ATG GTA 1799 
Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile Gly Glu Gln Met Val 
385 390 395 

[0099]

CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT CGA ACC CAA TCC CAG TTG 1847 
Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg Thr Gln Ser Gln Leu 
400 405 410 415 

GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA CTG ACC TTA TAT GTG ACC 1895 
Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu Thr Leu Tyr Val Thr 

420 425 430 

CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA TGAGGCTCCT GCTCTGATTT CC 1952 
Pro Glu Val Thr Glu 
435 

TCCTTGCCTT TCTGGCTGAG GTTCTGAGGG CACCGAGACA GAGGGTTAAA TGAACCAGTG 2012 
GGGGCAGGTC CCTCCAACCA CCAGCACTGA CTCCTGGGCT CTGAAGAATC ACAGAAACAC 2072 
TTTTTATATA AAATAAAATT ATACCTAGCA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2132 
AAAAAAAAAA AA 2144 

SEQ ID NO: 29

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Ala Arg Glu Leu Gly Ala Val
305 310 315 320

Ser Leu Gln Asp Gly Glu Val Ile Gly Val Asn Thr Met Lys Val Thr
325 330 335

Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp Arg Leu Arg Glu Phe Leu
340 345 350

His Arg Gly Glu Lys Lys Asn Ser Ser Ser Gly Ile Ser Gly Ser Gln
355 360 365

Arg Arg Tyr Ile Gly Val Met Met Leu Thr Leu Ser Pro Arg Ala Gly
370 375 380

Leu Arg Pro Gly Asp Val Ile Leu Ala Ile Gly Glu Gln Met Val Gln
385 390 395 400

Asn Ala Glu Asp Val Tyr Glu Ala Val Arg Thr Gln Ser Gln Leu Ala
405 410 415

Val Gln Ile Arg Arg Gly Arg Glu Thr Leu Thr Leu Tyr Val Thr Pro
420 425 430

Glu Val Thr Glu
435

[0100] SEQ ID NO: 30 (amino acid 24 = Arg/Cys; amino acid 278 = Ala/Val; nucleotides 672 and 1435 = Y, i.e. C or T)

[...] GCGTAGAGCA 60
[...] GAAGTGGTCA 120
[...] CTTACCGGTG 180
[...] CTGCTTCAGG 240
[...] TCTTGGGAAG 300
[...] CGCGCCGGAA 360
[...] CGCGTCCGGC 420
[...] GAATCCTGGT 480
[...] ACTGTCCGCC 540
[...] AAGGCGGAGC 600
[...] CTT CGG 647
[...] Leu Arg
1 5 10 15

GCA TGG CGG GCT TTG GGG GGC ATT YGC TGG GGG AGG AGA CCC CGT TTG 695 
Ala Trp Arg Ala Leu Gly Gly Ile Xaa Trp Gly Arg Arg Pro Arg Leu 



20 25 30 

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743 
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg 

35 40 45 

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791 
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val 

50 55 60 

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839 
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro 

65 70 75 

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887 



Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala 
80 85 90 95 

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935 
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu 

100 105 110 

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983 
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly 

115 120 125 

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031 
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro 

130 135 140 

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079 
Arg Ser Gin Tyr Asn Phe lie Ala Asp Val Val Glu Lys Thr Ala Pro 

145 150 155 

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127 
Ala Val Val Tyr He Glu He Leu Asp Arg His Pro Phe Leu Gly Arg 
160 165 170 175 

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175 
Glu Val Pro lie Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly 

180 185 190 

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223 
Leu He Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg 

195 200 205 

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271 
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val 

210 215 220 

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro

225 230 235 

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGC GAG 1367 
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gin Gly Glu 
240 245 250 255 

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr

260 265 270 

TCC GGC ATT GTT AGC TCT GYT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463
Ser Gly Ile Val Ser Ser Xaa Gln Arg Pro Ala Arg Asp Leu Gly Leu
275 280 285 

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511

Pro Gin Thr Asn Val Glu Tyr He Gin Thr Asp Ala Ala He Asp Phe 

290 295 300 

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG ATT GGA 1559 
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val He Gly 

305 310 315 

GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT 1607 
Val Asn Thr Met Lys Val Thr Ala Gly He Ser Phe Ala lie Pro Ser 
320 325 330 335 

GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC 1655
Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser 
340 345 350 



TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG ATG CTG 1703 
Ser Gly He Ser Gly Ser Gin Arg Arg Tyr He Gly Val Met Met Leu 

355 360 365 

ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA CCA AGC 1751 
Thr Leu Ser Pro Ser He Leu Ala Glu Leu Gin Leu Arg Glu Pro Ser 

370 375 380 

TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC CTG GGC 1799 
Phe Pro Asp Val Gin His Gly Val Leu He His Lys Val He Leu Gly 

385 390 395 

TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT GGT GAT GTG ATT TTG GCC 1847 
Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val He Leu Ala 
400 405 410 415 

ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT 1895 
He Gly Glu Gin Met Val Gin Asn Ala Glu Asp Val Tyr Glu Ala Val 

420 425 430 

CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA 1943 
Arg Thr Gin Ser Gin Leu Ala Val Gin He Arg Arg Gly Arg Glu Thr 

435 440 445 

CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA 1996 
Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu 

450 455 
TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 2056 
CAGAGGGTTA AATGAACCAG TGGGGGCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 2116 
CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAACATAAAA 2176 
AAAAAAAAAA A 2187 
[0101] SEQ ID NO: 31
* Xaa at position 24 = Arg or Cys
* Xaa at position 278 = Ala or Val
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Xaa Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Xaa Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly Val
305 310 315 320

Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp
325 330 335

Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser
340 345 350

Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr
355 360 365

Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser Phe
370 375 380

Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly Ser
385 390 395 400

Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile
405 410 415

Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg
420 425 430

Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu
435 440 445

Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

[0102] SEQ ID NO: 32
CATCCGGCAT TGTTAGCTCT GC 22

[0103] SEQ ID NO: 33
CATCCGGCAT TGTTAGCTCT GT 22

[0104] SEQ ID NO: 34
CAATAGCTGC ATCAGTTTGA ATG 23

[0105] SEQ ID NO: 35
TGGCGGGCTT TGGGGGGCAT TC 22

[0106] SEQ ID NO: 36
TGGCGGGCTT TGGGGGGCAT TT 22

[0107] SEQ ID NO: 37
GACGTCAGCA GGGCCCGGAG GTC 23

[0108] SEQ ID NO: 38
CATACCCCAG CAGAAGCTGG 20

[0109] SEQ ID NO: 39
CATACCCCAG CAGAAGCTGT 20

[0110] SEQ ID NO: 40
GCTGACATCA TTGGCGGAGA C 21

[Brief Description of the Drawings]

[Fig. 1] An amino acid sequence alignment of E. coli htrA and PSP1-1.

[Figs. 2 to 10] A multiple cDNA sequence alignment of the PSP1 isolates PSP1-1, PSP1-2, PSP1-3 and PSP1-4.

[Fig. 11] An amino acid sequence alignment of the putative human serine protease and PSP1-1.



[Fig. 1]



• • • » 

41 SGTSDPRARVTYGTPSLWARLSVGVTEPRACLTSGTPGPRAQLTAVTPDT 90 

. |. : .|s..: . I I.. . - I - : - - 

2 KKTTLALSiUjALSLGLALSPLSATAAETSSATTAQQMPSIAPMLEKVMPS 51 

. . . . • 

91 RTREASENSGTRSRAWLAVALGAGGAVLLLLWGGGRG PPAVLAAVPS PPP 14 0 
. . . 1 . | . | . : : : : . : . . : : I : : . . : . . .11 . 

52 WSINVEGS TTVUT PRMPRN FQQ FFGDD SPFCQEGSPFQ 90 

, • • • 

141 AS PRSQ YN FIAD WEKTAPAWY I E I LDRH P FLGREVPI SNGSGFWAAD m 190 
. j | : | : : . : I I I . : : ♦ I I 

91 SSPFCQ. . . GGQGGNGGGQQQKFMAL GSGVIIDAD 122 

• • • ' 

191 .GLIVTNAHWADRRRVRVRLLSGDTYEAVVTAVDPVADIATLRIQTKEP 23 9 

| . : | 1 | . I I I . : : : I • I -I . : : I : . : I 1 -III : . I I : - • 
123 KGYWTNUHWDNATVIKVQLSDGRKFDAKMVGKDPRSDIALIQIQNPKN 172 

240 LPTLPLGRSADVRQGEFWAMGSPFALQNTITSGIVSSAQRPARDLGLPQ 289 

I . .:. : : I . - : I I : : . I : : I . I I s I . : I : I I I I I I • • I • II • 
173 LTAI KMADS DALRVGD YT VGIGN P FGLGETVTSGI VS ALGRS GLNA 218 

290 TNVE.YIQTDAAIDFGNSGGPLVNLDGEYIGVNTMKVTA GISFAI 333 

.| | : | I | II | I : I I II Js I II !:! |:| I : I I : • : 11 = 111 

219 ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAI 268 

> 

334 PSDRLREF LHRGE J 346 

II:.:::: : . I I I 

269 PSNMVKNLTSQMVEYGQVKRGELGI^^ 318 

347 . KKNSSSGISG SQRRYIGVM MLTL. . . 369 

. I I | . : . I - I : I • I ? I I I 

319 VLPNS S AAKAG IKAG DVITSLNGKPISS FAALRAQVGTMP VGS KLTLGLL 348. 

370 ...... 7 77 7777777: ^ r^TSPSIIAEIiQLREPSFPDVQHGVL-rHKVrir 398 

| . | | : . : : : | | . : : : I I : : : - I 
369 RDG KQVN VNLELQQS S QNQ V DS SSI FNG I EGAEMS NKG KDQG VWNNVKT 418 

399 GSPAHRAGLRPGDVIIiAIGEQMVQNAEDVYEAVRTQ . SQLAVQIRRGRET 4 47 

| . | | . | I I I I I : : . : I I . I . :: . . : • • I I I I - I I 
419 GT PAAQIGLKKGDVT IGANQQAVKN I AELRKVLDSKPS VLALN IQRG DRH 4 68 

4 48 LTLYVTPEVTE 458 
I.: 

4 69 LPVNAVISLNP 4 79 



[Fig. 2]

1 . 50 
PSP1-2 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC 
PSP1-1 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC 
PSP1-3 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC 
PSP1-4 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC 

51 100 
PSPl^-2 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG 
PSP1-1 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG 
PSP1-3 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG 
PSP1-4 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG 



101 150
PSP1-2 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC
PSP1-1 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC
PSP1-3 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC
PSP1-4 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC

151 200
PSP1-2 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG
PSP1-1 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG
PSP1-3 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG
PSP1-4 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG

201 250
PSP1-2 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC
PSP1-1 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC
PSP1-3 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC
PSP1-4 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC

251 300
PSP1-2 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG
PSP1-1 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG
PSP1-3 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG
PSP1-4 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG



[Fig. 3]

301 350
PSP1-2 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG
PSP1-1 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG
PSP1-3 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG
PSP1-4 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG



351 400
PSP1-2 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG
PSP1-1 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG
PSP1-3 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG
PSP1-4 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG

401 450
PSP1-2 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC
PSP1-1 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC
PSP1-3 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC
PSP1-4 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC

451 500
PSP1-2 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG
PSP1-1 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG
PSP1-3 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG
PSP1-4 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG

501 550
PSP1-2 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC
PSP1-1 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC
PSP1-3 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC
PSP1-4 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC

551 600
PSP1-2 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC
PSP1-1 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC
PSP1-3 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC
PSP1-4 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC



[Fig. 4]

601 650
PSP1-2 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA
PSP1-1 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA
PSP1-3 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA
PSP1-4 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA

651 700
PSP1-2 TGGCGGGCTT TGGGGGGCAT TTGCTGGGGG AGGAGACCCC GTTTGACCCC
PSP1-1 TGGCGGGCTT TGGGGGGCAT TCGCTGGGGG AGGAGACCCC GTTTGACCCC
PSP1-3 TGGCGGGCTT TGGGGGGCAT TCGCTGGGGG AGGAGACCCC GTTTGACCCC
PSP1-4 TGGCGGGCTT TGGGGGGCAT TCGCTGGGGG AGGAGACCCC GTTTGACCCC

701 750
PSP1-2 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG
PSP1-1 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG
PSP1-3 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG
PSP1-4 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG

751 800
PSP1-2 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT
PSP1-1 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT
PSP1-3 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT
PSP1-4 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT

801 850
PSP1-2 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT
PSP1-1 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT
PSP1-3 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT
PSP1-4 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT

851 900
PSP1-2 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG
PSP1-1 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG
PSP1-3 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG
PSP1-4 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG




[Fig. 5]

901 950 
PSP1-2 GAACCCGTTC GCGCGCGTGG' CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA 
PSP1-1 GAACCCGTTC GCGCGCGTGG CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA 
PSP1-3 GAACCCGTTC GCGCGCGTGG . CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA 
PSP1-4 GAACCCGTTC GCGCGCGTGG CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA 

951 1000 
PSP1-2 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC 
PSPi-1 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC 
PSP1-3 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC 
PSP1-4 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC 

1001 1050 
PSP1-2 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA 
PSPI-1 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA 
PS PI -3 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA 
PSPl-4 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA 

1051 . 1100 

PSP1-2 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC 
PSPI-1 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC 
PSP1-3 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC 
PSPl-4 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC 

1101 1150 
PSP1-2 CTGGACCGGC ACCCTTTCTT GGGCCGCGAG GTCCCTATCT CGAACGGCTC 
PSPI-1 CTGGACCGGC ACCCTTTCTT GGGCCGCGAG GTCCCTATCT CGAACGGCTC 
PSP1-3 CTGGACCGGC ACCCTTTCTT ^GGGCCGCGAG GTCCCTATCT CGAACGGCTC 
PSPl-4 CTGGACCGGC ACCCTTTCTT GGGCCGCGAG GTCCCTATCT CGAACGGCTC 

1151 1200 
PSP1-2 AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC GCCCATGTGG 
PSPl-i AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC GCCCATGTGG 
PS-PI -3- AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC-GCCCATGTGG 
PSP1-4 AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC GCCCATGTGG 




[Fig. 6]

1201 1250
PSP1-2 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT
PSP1-1 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT
PSP1-3 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT
PSP1-4 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT

1251 1300
PSP1-2 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG
PSP1-1 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG
PSP1-3 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG
PSP1-4 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG

1301 1350
PSP1-2 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG
PSP1-1 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG
PSP1-3 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG
PSP1-4 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG

1351 1400
PSP1-2 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG
PSP1-1 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG
PSP1-3 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG
PSP1-4 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG

1401 1450
PSP1-2 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG
PSP1-1 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG
PSP1-3 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG
PSP1-4 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG

1451 1500
PSP1-2 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG
PSP1-1 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG
PSP1-3 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG
PSP1-4 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG



[Fig. 7]

1501 1550 

PSP1-2 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT 

PSP1-1 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT 

PSP1-3 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT GGTGAGTGAG 
PS PI- 4 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT 

1551 1600 

PSP1-2 

PSP1-1 

PS PI- 3 ACATCCTTCC TTCCAAGAAT CCCTGCCCCA GGTCAGTGTG GGAAGGGTAG 

PSPl-4 

1601 1650 

PSP1-2 

PSP1-1 

PSP1-3 GTTTCCCCTA ATTCAAGGAT GTTTGGTCAA GTTTCTGAGC AGTTCTTTGT 

PSPl-4 - 

1651 1700 

PSP1-2 % 

PSP1-1 - 

PSP1-3 TGGCTATCTC TCAATATCCA ACCAGATCTC CCCAACACTT GCTGGTACTT 

PSPl-4 

1701 a- 750 

PSPl-2 . . . • 

PSP1-1 

PSP1-3 TTGTTCGGGT GCCCCCATCC CCTACTATTT GTT TAGG CTA GGGAACTGGG 
PSPl-4 . GGCTA GGGAACTGGG 

1751 1800 

p S pi-2 GGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 

PSPl-l" . .GGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 

PS PI -3 GGCTGTATCC CTGCAGGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 
PSPl-4 GGCTGTATCC CTGCAGGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 



[Fig. 8]

1801 1850 

PSPl-2 TCACAGCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 

PSPl-1 TCACAGCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 

PSP1-3 TCACAGCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 

PSPl-4 TCACAGCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 

1851 . 1900 

PSP1-2 CTGCATCGTG GGGAAAAGAA GAATTCCTCC TCCGGAATCA GTGGGTCCCA 
PSPl-1 CTGCATCGTG GGGAAAAGAA GAATTCCTCC TCCGGAATCA GTGGGTCCCA 
PSPl-3 CTGCATCGTG GGGAAAAGAA GAATTCCTCC TCCGGAATCA GTGGGTCCCA 
PSPl-4 CTGCATCGTG GGGAAAAGAA GAATTCCTCC TCCGGAATCA GTGGGTCCCA 

1901 1950 
PSP1-2 GCGGCGCTAC ATTGGGGTGA TGATGCTGAC CCTGAGTCCC AGCATCCTTG 
PSPl-1 GCGGCGCTAC ATTGGGGTGA TGATGCTGAC CCTGAGTCCC AGCATCCTTG 
PSPl-3 GCGGCGCTAC ATTGGGGTGA TGATGCTGAC CCTGAGTCCC AGCATCCTTG 
PS P It- 4 GCGGCGCTAC ATTGGGGTGA TGATGCTGAC CCTGAGTCCC A 

■ 1951 2000 

PSPl-2 CTGAACTACA GCTTCGAGAA CCAAGCTTTC CCGATGTTCA GCATGGTGTA 

PSPl-1 'CTGAACTACA GCTTCGAGAA CCAAGCTTTC CCGATGTTCA GCATGGTGTA 

PSPl-3 CTGAACTACA GCTTCGAGAA CCAAGCTTTC CCGATGTTCA GCATGGTGTA 



PSPl-4 



2001 2050 
PSP1-2 CTCATCCATA AAGTCATCCT GGGCTCCCCT GCACACCGGG CTGGTCTGCG 
PSPl-1 CTCATCCATA AAGTCATCCT GGGCTCCCCT GCACACCGGG CTGGTCTGCG 
PSPl-3 CTCATCCATA AAGTCATCCT ""GGGCTCCCCT GCACACCGGG CTGGTCTGCG 
PSPl-4 * - . • -GGG CTGGTCTGCG 

2051 2100 
PSP1-2 GCCTGGTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG 
PSPl-1 GCCTGGTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG 
PSPl-3" GCCTGGTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG 
PSPl-4 GCCTGGTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG 
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2101 2150 
PSPl-2 AAGATGTTTA TGAAGCTGTT CGAACCCAAT CCCAGTTGGC AGTGCAGATC 
PSP1-1 AAGATGTTTA TGAAGCTGTT CGAACCCAAT CCCAGTTGGC AGTGCAGATC 
PSP1-3 AAGATGTTTA TGAAGCTGTT CGAACCCAAT CCCAGTTGGC AGTGCAGATC 
PSP1-4 AAGATGTTTA TGAAGCTGTT CGAACCCAAT CCCAGTTGGC AGTGCAGATC 

2151 2200 
PSP1-2 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA 
PSP1-1 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA 
PSP1-3 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA 
PS Pi -4 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA 

2201 2250 
PSP1-2 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC 
PSP1-1 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC 
PSP1-3 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC . 
PSP1-4 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC 

2251 2300 
PSP1-2 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA. 
PSP1-1 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA 
PSP1-3 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA 
PSP1-4 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA 

2301 2350 
PSP1-2 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA 
PSP1-1 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA 
PSP1-3 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA 
PSP1-4 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA 

2351 . 2400 

PSP1-2 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAACATAAA 
PSP1-1 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAACATAAA 
PSP1-3 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAACATATT
PSP1-4 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAAAAAAAA 
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2401 2450 

PSP1-2 AAAAAAAAAA AA 

PSPl-1 AAAAAAAAAA AA ; 

PS PI -3 ATAGTAAAAA ATGAGGTGGG AGGGCTGGAT CTTTTCCCCC ACCAAAAGGC 
PSP1-4 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAA 

2451 . 2500 

PSP1-2 

PSPl-1 

PSP1-3 TAGAGGTAAA GCTGTATCCC CCrAAACTTA GGGGAGATAC TGGAGCTGAC 
PSP1-4 

2501 2550 

PSP1-2 ...7, 

PSPl-1 

PSP1-3 CATCCTGACC TCCTATTAAA GAAAATGAGC TGCTGAAAAA AAAAAAAAAA 
PSP1-4 

2551 
PSP1-2 . 
PSP1-1 . 
PSP1-3 A 
PSP1-4 .

HUMAN SERINE PROTEASE

The present invention relates to isolated human serine protease (PSP1) 
polynucleotides, their homologs and isoforms and polymorphic variants and their detection;
to essentially pure PSP1 proteins; and to compositions and methods of producing and using
PSP1 polynucleotides and proteins. 

Mutations in the presenilins (PS-1 and PS-2) account for ~95% (75% and 20%,
respectively) of all cases of early onset familial Alzheimer's disease (FAD). See R.
Sherrington et al., Nature 375, 754-760 (1995); E.I. Rogaev et al., Nature 376, 775-778
(1995); and E. Levy-Lahad et al., Science 269, 973-977 (1995). The presenilins are highly
homologous (67% identical), multi-membrane spanning proteins whose function is
unknown.

It has been demonstrated that the 46 kDa full-length PS-1 protein is normally
processed to 28 kDa and 18 kDa fragments; PS-2 has been reported to be similarly cleaved.
See M. Mercken et al., FEBS Letters 389, 297-303 (1996). The predicted cleavage site(s)
to account for fragments of this size would be in a region of the protein coded for by exon 8
and exon 9. Exon 8 is a hot spot for mutations leading to FAD. Thus, this region of PS-1,
and potentially the cleavage of PS-1 in this region by a presenilinase protease, are
important events in the functionality of the protein. A region of PS-1 spanning exons 8-11
has been demonstrated in the present invention to specifically bind a protease, PSP1, whose
activity against its endogenous substrates and/or ability to bind to PS-1 are important in the
pathology of neurodegeneration associated with AD, frontal lobe dementia, cortical Lewy
body disease, dementia of Parkinson's disease, acute and chronic phases of degeneration
following stroke or head injury, neuronal degeneration found in motor neurone disease,
AIDS dementia and chronic epilepsy. Thus, a need exists for provision of the nucleotide and
amino acid sequences corresponding to PSP1, for modulators of PSP1 binding to PS-1,
and/or modulators of PSP1's proteolytic activity, for methods to identify such modulators
and for reagents useful in such methods.

Accordingly, one aspect of the present invention is an isolated polynucleotide
encoding a biologically active PSP1 polypeptide.

Another aspect of the invention is an isolated polynucleotide selected from the
group consisting of:

(a) a polynucleotide encoding PSP1-1 having the nucleotide sequence as set forth
in SEQ ID NO: 24 from nucleotide 603 to 1979; and

(b) a polynucleotide substantially similar to SEQ ID NO: 24.

Another aspect of the invention is an isolated polynucleotide selected from the
group consisting of:

(a) a polynucleotide encoding PSP1-2 having the nucleotide sequence as set forth
in SEQ ID NO: 23 from nucleotide 603 to 1979; and

(b) a polynucleotide substantially similar to SEQ ID NO: 23.

Another aspect of the invention is an isolated polynucleotide selected from the
group consisting of:

(a) a polynucleotide encoding PSP1-3 having the nucleotide sequence as set forth
in SEQ ID NO: 26 from nucleotide 603 to 1736; and

(b) a polynucleotide substantially similar to SEQ ID NO: 26.

Another aspect of the invention is an isolated polynucleotide selected from the
group consisting of:

(a) a polynucleotide encoding PSP1-4 having the nucleotide sequence as set forth
in SEQ ID NO: 28 from nucleotide 603 to 1913; and

(b) a polynucleotide substantially similar to SEQ ID NO: 28.

In a further aspect the invention provides any isolated polynucleotide as above
defined wherein nucleotides 672 and 1435 are independently selected from C and T,
hereinafter referred to as 'polymorphic variants'.

Another aspect of the invention is the functional polypeptides encoded by the
polynucleotides of the invention.

Another aspect of the invention is an antisense oligonucleotide comprising a
sequence which is capable of binding to the polynucleotides of the invention or D87258.

Another aspect of the invention is modulators of the polypeptides of the invention
or of D87258.

Another aspect of the invention is a method for assaying a medium for the presence
of a substance that modulates PSP1 or D87258 activity by affecting the binding of PSP1 or
D87258 to cellular binding partners comprising the steps of:

(a) providing a PSP1 or D87258 protein having the amino acid sequence of
PSP1-1, PSP1-2, PSP1-3 or PSP1-4 or D87258, or a functional derivative or polymorphic
variant thereof and a cellular binding partner or synthetic analog thereof;

(b) incubating with a test substance which is suspected of modulating
PSP1 or D87258 activity under conditions which permit the formation of a PSP1 or D87258
protein/cellular binding partner complex;

(c) assaying for the presence of the complex, free PSP1 or D87258 protein
or free cellular binding partner; and

(d) comparing to a control to determine the effect of the substance.

Another aspect of the invention is a method for assaying a medium for the presence
of a substance that modulates PSP1 or D87258 activity by inhibiting proteolytic activity on
a cellular substrate comprising the steps of:

(a) providing a PSP1 or D87258 protein having the amino acid sequence of
PSP1-1, PSP1-2, PSP1-3 or PSP1-4 or D87258, or a functional fragment or polymorphic
variant thereof and a cellular substrate or synthetic analog thereof;

(b) incubating with a test substance which is suspected of inhibiting PSP1
or D87258 activity under conditions which permit the formation of a PSP1
enzyme/substrate complex and subsequent cleavage of the substrate;

(c) assaying for the presence of proteolytically cleaved substrate; and

(d) comparing to a control to determine the effect of the substance.

Another aspect of the invention is a method for assaying for the presence of a
substance that modulates PSP1 or D87258 activity by direct binding to PSP1 or D87258
protein comprising the steps of:

(a) providing a labelled PSP1 or D87258 protein having the amino acid
sequence of PSP1-1, PSP1-2, PSP1-3 or PSP1-4 or D87258 or a functional derivative or
polymorphic variant thereof;

(b) providing solid support-associated modulator candidates;

(c) incubating a mixture of the labelled PSP1 or D87258 protein with the
support-associated modulator candidates under conditions which can permit the formation
of a PSP1 protein/modulator candidate complex;

(d) separating the solid support from free soluble labelled PSP1 or D87258
protein;

(e) assaying for the presence of solid support-associated labelled protein;

(f) isolating the solid support complexed with labelled PSP1 or D87258
protein; and

(g) identifying the modulator candidate.

Another aspect of the invention is PSP1 or D87258 protein modulating compounds
identified by the methods of the invention.

Another aspect of the invention is a method for the treatment of a patient having
need to modulate PSP1 or D87258 activity comprising administering to the patient a
therapeutically effective amount of the modulating compounds of the invention.

Another aspect of the invention is a method of diagnosing conditions associated
with PSP1 or D87258 protein deficiency which comprises:

(a) isolating a polynucleotide sample from an individual;

(b) assaying the polynucleotide sample and a polynucleotide of the invention
encoding PSP1 or D87258; and

(c) comparing differences between the polynucleotide sample and the PSP1 or
D87258 polynucleotide, wherein any differences indicate mutations in the PSP1 or D87258
sequence.
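The comparison in step (c) can be pictured with a short sketch. The text does not say how the comparison is carried out, so the position-by-position scan below is only one assumed approach, and the function name `find_differences` is hypothetical. The two 50-nucleotide strings are nucleotides 651-700 of the PSP1-1 and PSP1-2 isolates as printed in the cDNA alignment figure; the sketch assumes the sequences are already aligned and of equal length.

```python
def find_differences(reference: str, sample: str, start: int = 1):
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are already aligned and of equal length;
    `start` is the 1-based coordinate of the first base.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start + i, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference.upper(), sample.upper()))
        if ref != obs
    ]


# Nucleotides 651-700 of two isolates, copied from the cDNA alignment figure.
PSP1_1 = "TGGCGGGCTTTGGGGGGCATTCGCTGGGGGAGGAGACCCCGTTTGACCCC"
PSP1_2 = "TGGCGGGCTTTGGGGGGCATTTGCTGGGGGAGGAGACCCCGTTTGACCCC"

print(find_differences(PSP1_1, PSP1_2, start=651))  # [(672, 'C', 'T')]
```

The single reported difference falls at nucleotide 672, one of the two polymorphic positions named in the text.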

Another aspect of the invention is a method of treating conditions which are related
to insufficient PSP1 or D87258 protein function which comprises:

(a) isolating cells from a patient deficient in PSP1 or D87258 protein function;

(b) altering the cells by transfecting the polynucleotide of the invention or D87258
into the cells wherein a PSP1 or D87258 protein is expressed; and

(c) introducing the cells back to the patient to alleviate the condition.

Another aspect of the invention is a method of treating conditions which are related
to insufficient PSP1 or D87258 protein function which comprises administering the
polynucleotide of the invention to a patient deficient in PSP1 protein function wherein a
PSP1 or D87258 protein is expressed and alleviates the condition.

Another aspect of the invention is an antibody immunoreactive with PSP1 or
D87258 or an immunogen thereof.

Another aspect of the invention is a transgenic non-human animal capable of
expressing in any cell thereof the polynucleotide of the invention.

Another aspect of the invention is a method for determining the genetic
predisposition to neurodegeneration in a patient comprising detecting PSP1 or D87258
polymorphisms in a sample from a patient. Yet another aspect of the invention is an
isolated polynucleotide having the nucleotide sequence as set forth in SEQ ID NO: 32, 33,
34, 35, 36, 37, 38, 39, or 40.

Figure 1 is an amino acid sequence alignment of PSP1-1 with E. coli htrA.

Figure 2 is a multiple cDNA sequence alignment of the PSP1 isolates PSP1-1,
PSP1-2, PSP1-3 and PSP1-4.

Figure 3 is an amino acid sequence alignment of PSP1-1 with a putative human
serine protease.

As used herein, the term "PSP1 polynucleotide" or "PSP1" refers to DNA
molecules comprising a nucleotide sequence that encodes PSP1 and alternative splice
variants, i.e., homologs and isoforms, and polymorphic variants. PSP1 binds to a region
encompassing amino acids 269-413 of the human PS-1 protein, contains a conserved serine
protease motif and exhibits homology to the E. coli serine protease htrA described by
Lipinska et al. in Nucl. Acids Res. 16, 10053-10066 (1988) and a putative human serine
protease with an IGF-binding motif (Ohno, I., et al., Genbank Accession No. D87258
(1996)), hereinafter referred to as D87258.

The PSP1-1 sequence is listed in SEQ ID NO: 24. The coding region of this
sequence consists of nucleotides 603-1979 of SEQ ID NO: 24. The deduced 458 amino
acid sequence of the encoded product PSP1-1 is listed in SEQ ID NO: 25.

The PSP1-1 sequence listed in SEQ ID NO: 30 includes two polymorphic variants,
at nucleotides 672 (C/T) and 1435 (C/T) resulting in alternative amino acid residues at
position 24 (arg/cys) and 278 (ala/val), both in the conserved region of nucleotides 1-1540.
The deduced 458 amino acid sequence of the encoded product PSP1-1 is listed in SEQ ID
NO: 31.
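
The residue positions quoted for the two variants can be related to the nucleotide
positions by simple arithmetic. The sketch below is illustrative only: it assumes that the
coding region of SEQ ID NO: 30 begins at nucleotide 603, as stated for SEQ ID NO: 24, and
the function name is hypothetical.

```python
def codon_location(nt_pos: int, cds_start: int) -> tuple[int, int]:
    """Map a 1-based nucleotide position to (codon number, position within codon).

    Both returned values are 1-based; the position within the codon is 1, 2 or 3.
    """
    if nt_pos < cds_start:
        raise ValueError("position lies upstream of the coding region")
    offset = nt_pos - cds_start
    return offset // 3 + 1, offset % 3 + 1


CDS_START = 603  # assumption: same start as stated for SEQ ID NO: 24

for nt in (672, 1435):
    print(nt, codon_location(nt, CDS_START))
# 672 (24, 1)
# 1435 (278, 2)
```

Under that assumption, nucleotide 672 is the first base of codon 24 and nucleotide 1435 is
the second base of codon 278. A first-position C/T change is consistent with arg/cys
(e.g., CGC versus TGC) and a second-position C/T change with ala/val (GCN versus GTN).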

The PSP1-2 sequence is listed in SEQ ID NO: 23. The coding region of this
sequence consists of nucleotides 603-1979 of SEQ ID NO: 23. The deduced 458 amino
acid sequence of the encoded product PSP1-2 is listed in SEQ ID NO: 8. The PSP1-3
sequence is listed in SEQ ID NO: 26. The coding region of this sequence consists of
nucleotides 603-1736 of SEQ ID NO: 26. The deduced 377 amino acid sequence of the
encoded product PSP1-3 is listed in SEQ ID NO: 27. The PSP1-4 sequence is listed in
SEQ ID NO: 28. The coding region of this sequence consists of nucleotides 603-1913 of
SEQ ID NO: 28. The deduced 436 amino acid sequence of the encoded product PSP1-4 is
listed in SEQ ID NO: 29.
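
As a consistency check, the stated coding-region coordinates can be compared with the
stated lengths of the deduced products. This is a rough sketch that assumes each quoted
coding region includes a terminal stop codon; the function and dictionary names are
hypothetical.

```python
def expected_residues(cds_start: int, cds_end: int) -> int:
    """Number of amino acids encoded by an inclusive, 1-based coding region.

    Assumes the region ends with a stop codon, which encodes no residue.
    """
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length // 3 - 1


# Coordinates as quoted in the text for each isolate.
coding_regions = {
    "PSP1-1": (603, 1979),
    "PSP1-2": (603, 1979),
    "PSP1-3": (603, 1736),
    "PSP1-4": (603, 1913),
}

residue_counts = {name: expected_residues(*span) for name, span in coding_regions.items()}
print(residue_counts)
# {'PSP1-1': 458, 'PSP1-2': 458, 'PSP1-3': 377, 'PSP1-4': 436}
```

The computed values agree with the 458, 458, 377 and 436 amino acid lengths given above.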

The D87258 sequence is listed in SEQ ID NO: 17. The coding region of this
sequence consists of nucleotides 49-1491 of SEQ ID NO: 17. The deduced 480 amino acid
sequence of the encoded product D87258 is listed in SEQ ID NO: 18. The D87258
sequence listed in SEQ ID NO: 17 includes a polymorphic variant at nucleotide 1325 (G/T)
resulting in alternative amino acid residues at position 213 (gly/val). The sequence in
Genbank Accession No. D87258 (1996) describes only 1325G. The novel polynucleotide
polymorph of D87258 having 1325T is hereinafter referred to as D87258 (1325T) and the
novel encoded product having valine at 213 is D87258 (1325T) protein. The novel
polynucleotide D87258 (1325T) and its encoded protein can replace PSP1 in any of the
compositions, uses or methods herein described and such novel polypeptide, encoded
protein, compositions, uses and methods also form part of the invention.

As used herein, the term "functional fragments" when used to modify a specific
gene or gene product means a less than full length portion of the gene or gene product
which retains substantially all of the biological function associated with the full length gene
or gene product to which it relates. An example of a functional fragment of PSP1 is the
minimal catalytic domain. To determine whether a fragment of a particular gene or gene
product is a functional fragment, fragments are generated by well-known nucleolytic or
proteolytic techniques or by the polymerase chain reaction and the fragments tested for the
described biological function.

As used herein, an "antigen" refers to a molecule containing one or more epitopes
that will stimulate a host's immune system to make a humoral and/or cellular antigen-specific
response. The term is also used herein interchangeably with "immunogen."

As used herein, the term "epitope" refers to the site on an antigen or hapten to
which a specific antibody molecule binds. The term is also used herein interchangeably
with "antigenic determinant" or "antigenic determinant site."

As used herein, "monoclonal antibody" is understood to include antibodies derived
from one species (e.g., murine, rabbit, goat, rat, human, etc.) as well as antibodies derived
from two (or perhaps more) species (e.g., chimeric and humanized antibodies).

As used herein, a coding sequence is "operably linked to" another coding sequence
when RNA polymerase will transcribe the two coding sequences into a single mRNA,
which is then translated into a single polypeptide having amino acids derived from both
coding sequences. The coding sequences need not be contiguous to one another so long as
the expressed sequence is ultimately processed to produce the desired protein.

As used herein, "recombinant" polypeptides refer to polypeptides produced by
recombinant DNA techniques; i.e., produced from cells transformed by an exogenous DNA
construct encoding the desired polypeptide. "Synthetic" polypeptides are those prepared by
chemical synthesis.

As used herein, a "replicon" is any genetic element (e.g., plasmid, chromosome,
virus) that functions as an autonomous unit of DNA replication in vivo; i.e., capable of
replication under its own control.

As used herein, a "vector" is a replicon, such as a plasmid, phage, or cosmid, to
which another DNA segment may be attached so as to bring about the replication of the
attached segment.

As used herein, a "reference" gene refers to the wild type PSP1 sequence of the 
invention and is understood to include the various sequence polymorphisms that exist, 
wherein nucleotide substitutions in the gene sequence exist, but do not affect the essential 
function of the gene product.

As used herein, a "mutant" gene refers to PSP1 sequences different from the
reference gene wherein nucleotide substitutions and/or deletions and/or insertions result in
perturbation of the essential function of the gene product.

As used herein, a DNA "coding sequence of" or a "nucleotide sequence encoding" a
particular protein, is a DNA sequence which is transcribed and translated into a polypeptide
when placed under the control of appropriate regulatory sequences.

As used herein, a "promoter sequence" is a DNA regulatory region capable of
binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction)
coding sequence. For purposes of defining the present invention, the promoter sequence is
bound at its 3' terminus by a translation start codon (e.g., ATG) of a coding sequence and
extends upstream (5' direction) to include the minimum number of bases or elements
necessary to initiate transcription at levels detectable above background. Within the
promoter sequence will be found a transcription initiation site (conveniently defined by
mapping with nuclease S1), as well as protein binding domains (consensus sequences)
responsible for the binding of RNA polymerase. Eukaryotic promoters will often, but not
always, contain "TATA" boxes and "CAT" boxes. Prokaryotic promoters contain Shine-Dalgarno
sequences in addition to the -10 and -35 consensus sequences.

As used herein, DNA "control sequences" refers collectively to promoter sequences,
ribosome binding sites, polyadenylation signals, transcription termination sequences,
upstream regulatory domains, enhancers and the like, which collectively provide for the
expression (i.e., the transcription and translation) of a coding sequence in a host cell.

As used herein, a control sequence "directs the expression" of a coding sequence in
a cell when RNA polymerase will bind the promoter sequence and transcribe the coding
sequence into mRNA, which is then translated into the polypeptide encoded by the coding
sequence.

As used herein, a "host cell" is a cell which has been transformed or transfected, or
is capable of transformation or transfection by an exogenous DNA sequence.

As used herein, a cell has been "transformed" by exogenous DNA when such
exogenous DNA has been introduced inside the cell membrane. Exogenous DNA may or
may not be integrated (covalently linked) into chromosomal DNA making up the genome of
the cell. In prokaryotes and yeasts, for example, the exogenous DNA may be maintained on
an episomal element, such as a plasmid. With respect to eukaryotic cells, a stably
transformed or transfected cell is one in which the exogenous DNA has become integrated
into the chromosome so that it is inherited by daughter cells through chromosome
replication. This stability is demonstrated by the ability of the eukaryotic cell to establish
cell lines or clones comprised of a population of daughter cells containing the exogenous
DNA.

As used herein, "transfection" or "transfected" refers to a process by which cells
take up foreign DNA and integrate that foreign DNA into their chromosome. Transfection
can be accomplished, for example, by various techniques in which cells take up DNA (e.g.,
calcium phosphate precipitation, electroporation, assimilation of liposomes, etc.) or by
infection, in which viruses are used to transfer DNA into cells.

As used herein, a "target cell" is a cell that is selectively transfected over other
cell types (or cell lines).

As used herein, a "clone" is a population of cells derived from a single cell or
common ancestor by mitosis. A "cell line" is a clone of a primary cell that is capable of
stable growth in vitro for many generations.

As used herein, a "heterologous" region of a DNA construct is an identifiable
segment of DNA within or attached to another DNA molecule that is not found in
association with the other molecule in nature. Thus, when the heterologous region encodes
a gene, the gene will usually be flanked by DNA that does not flank the gene in the genome
of the source animal. Another example of a heterologous coding sequence is a construct
where the coding sequence itself is not found in nature (e.g., synthetic sequences having
codons different from the native gene). Allelic variation or naturally occurring mutational
events do not give rise to a heterologous region of DNA, as used herein.

As used herein, a "modulator" of a polypeptide is a substance which can affect the
polypeptide function, such as an inhibitor of enzymatic activity.

An aspect of the present invention is isolated polynucleotides encoding a PSP1
protein and substantially similar sequences. Isolated polynucleotide sequences are
substantially similar if they are capable of hybridizing under moderately stringent
conditions to SEQ ID NOs: 23, 24, 26 or 28 or they encode DNA sequences which are
degenerate to SEQ ID NOs: 23, 24, 26 or 28 or are degenerate to those sequences capable of
hybridizing under moderately stringent conditions to SEQ ID NOs: 23, 24, 26 or 28.

Moderately stringent conditions is a term understood by the skilled artisan and has
been described in, for example, Sambrook et al., Molecular Cloning: A Laboratory Manual,
2nd edition, Vol. 1, pp. 101-104, Cold Spring Harbor Laboratory Press (1989). An
exemplary hybridization protocol using moderately stringent conditions is as follows.
Nitrocellulose filters are prehybridized at 65°C in a solution containing 6X SSPE, 5X
Denhardt's solution (10 g Ficoll, 10 g BSA and 10 g polyvinylpyrrolidone per liter solution),
0.05% SDS and 100 µg/ml tRNA. Hybridization probes are labeled, preferably
radiolabelled (e.g., using the Bios TAG-IT® kit). Hybridization is then carried out for
approximately 18 hours at 65°C. The filters are then washed twice in a solution of 2X SSC
and 0.5% SDS at room temperature for 15 minutes. Subsequently, the filters are washed at
58°C, air-dried and exposed to X-ray film overnight at -70°C with an intensifying screen.

Degenerate DNA sequences encode the same amino acid sequence as SEQ ID NOs:
8, 25, 27 or 29 or the proteins encoded by that sequence capable of hybridizing under
moderately stringent conditions to SEQ ID NOs: 8, 25, 27, 29, but have variation(s) in the
nucleotide coding sequences because of the degeneracy of the genetic code. For example,
the degenerate codons UUC and UUU both code for the amino acid phenylalanine, whereas
the four codons GGX, where X = U, C, A, or G, all code for glycine.
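
The degeneracy described above can be illustrated with a small lookup table. The sketch
below uses only a partial excerpt of the standard genetic code (the entries needed for the
phenylalanine and glycine examples, plus leucine for contrast); the function names and the
toy RNA strings are hypothetical.

```python
from collections import defaultdict

# Partial excerpt of the standard genetic code (RNA codons); not a full table.
CODON_TABLE = {
    "UUU": "Phe", "UUC": "Phe",
    "UUA": "Leu", "UUG": "Leu",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
}


def synonymous_codons(table: dict[str, str]) -> dict[str, list[str]]:
    """Group codons by the amino acid they encode."""
    groups: dict[str, list[str]] = defaultdict(list)
    for codon, amino_acid in sorted(table.items()):
        groups[amino_acid].append(codon)
    return dict(groups)


def translate(rna: str) -> list[str]:
    """Translate an RNA string codon by codon using the partial table."""
    usable = len(rna) - len(rna) % 3
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3)]


groups = synonymous_codons(CODON_TABLE)
print(groups["Phe"])  # ['UUC', 'UUU']
print(groups["Gly"])  # ['GGA', 'GGC', 'GGG', 'GGU']

# Two different nucleotide sequences encoding the same amino acid sequence.
print(translate("UUUGGA") == translate("UUCGGG"))  # True
```

The last line shows two nucleotide sequences that differ at two positions yet are
degenerate with respect to one another in the sense used above.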

Alternatively, substantially similar sequences are defined as those nucleotide
sequences encoding proteins having PSP1 activity in which about 70%, preferably about
80%, and most preferably about 90%, of the nucleotides share identity with PSP1, i.e., a
sequence encoding a protein having PSP1 activity is substantially similar to any of SEQ ID
NOs: 23, 24, 26 or 28 when at least about 70% of all of the nucleotides of the sequence
match SEQ ID NOs: 23, 24, 26 or 28. Nucleotide sequences that are substantially similar
can be identified by hybridization or by sequence comparison.

Embodiments of the isolated polynucleotides of the invention include DNA,
genomic DNA and RNA, preferably of human origin. A method for isolating a nucleic acid
molecule encoding a PSP1 protein is to probe a genomic or cDNA library with a natural or
artificially designed probe using art recognized procedures. See, e.g., "Current Protocols in
Molecular Biology", Ausubel et al. (eds.) Greene Publishing Association and John Wiley
Interscience, New York, 1989, 1992. The ordinarily skilled artisan will appreciate that SEQ
ID NOs: 23, 24, 26 or 28 or fragments thereof comprising at least 15 contiguous
nucleotides are particularly useful probes. It is also appreciated that such probes can be and
are preferably labeled with an analytically detectable reagent to facilitate identification of
the probe. Useful reagents include, but are not limited to, radioisotopes, fluorescent dyes or
enzymes capable of catalyzing the formation of a detectable product. The probes would
enable the ordinarily skilled artisan to isolate complementary copies of genomic DNA,
cDNA or RNA polynucleotides encoding PSP1 proteins from human, mammalian or other
animal sources or to screen such sources for related sequences, e.g., additional members of
the family, type and/or subtype, including transcriptional regulatory and control elements as
well as other stability, processing, translation and tissue specificity-determining regions
from 5' and/or 3' regions relative to the coding sequences disclosed herein, all without
undue experimentation.
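
To make the notion of fragments "comprising at least 15 contiguous nucleotides" concrete,
the following sketch enumerates every contiguous window of a chosen length from a
sequence. It is an illustration only; the function name and the 20-nucleotide toy sequence
are hypothetical and the sketch says nothing about which fragments would make good probes.

```python
def contiguous_fragments(sequence: str, length: int = 15) -> list[str]:
    """Return every contiguous fragment of the given length, in order."""
    if length < 1:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Toy 20-nucleotide sequence for illustration only (hypothetical).
toy_sequence = "ATGGCGCGTACCGATTTGCA"
fragments = contiguous_fragments(toy_sequence, length=15)
print(len(fragments))  # 6
print(fragments[0])    # ATGGCGCGTACCGAT
```

A sequence of n nucleotides yields n - 15 + 1 such 15-nucleotide windows; longer fragments
can be obtained by passing a larger `length`.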

Another aspect of the invention is functional polypeptides encoded by the
polynucleotides of the invention and substantially similar polypeptides. An embodiment of
a functional polypeptide of the invention is the PSP1 protein having the amino acid
sequence set forth in SEQ ID NO: 8, 25, 27 or 29.

Polypeptide sequences that are substantially similar are those sequences having
PSP1 activity in which about 50%, preferably 70%, and most preferably about 90%, of the
amino acids share identity with PSP1, i.e., a sequence representing a polypeptide having
PSP1 activity is substantially similar to any of SEQ ID NOs: 8, 25, 27 or 29 when at least
about 50% of all of the amino acids of the sequence match SEQ ID NOs: 8, 25, 27 or 29.
Substantially similar polypeptide sequences can be identified by techniques such as
proteolytic digestion, gel electrophoresis, microsequencing and/or sequence comparison,
e.g., through use of the GAP algorithm available from the University of Wisconsin Genetics
Computer Group.
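
The percentage thresholds used in the definitions of substantially similar sequences can
be illustrated with a simple identity calculation. The sketch below is not the GAP
algorithm; it assumes the two sequences are already aligned to equal length without gaps,
and the function names and toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float) -> bool:
    """True when identity is at least the given percentage threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


# Toy sequences for illustration only (hypothetical).
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity)                                              # 90.0
print(meets_threshold("ACGTACGTAC", "ACGTTCGTAC", 70.0))     # True
print(meets_threshold("MKVLAAGIVG", "MKTLSSGLAG", 50.0))     # True
```

The 70% and 50% values passed above mirror the lower bounds quoted in the text for
nucleotide and polypeptide sequences respectively; a real comparison would first align
the sequences with a dedicated tool.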

Another aspect of the invention is a method for preparing essentially pure PSP1
protein. Yet another aspect is the PSP1 protein produced by the preparation method of the
invention. This protein has the amino acid sequence listed in SEQ ID NOs: 8, 25, 27 or 29
and includes variants with a substantially similar amino acid sequence that have the same
function. The proteins of this invention are preferably made by recombinant genetic
engineering techniques by culturing a recombinant host cell containing a vector encoding
the polynucleotides of the invention under conditions promoting the expression of the
protein and recovery thereof.

The isolated polynucleotides, particularly the DNAs, can be introduced into
expression vectors by operatively linking the DNA to the necessary expression control
regions, e.g., regulatory regions, required for gene expression. The vectors can be
introduced into an appropriate host cell such as a prokaryotic, e.g., bacterial, or eukaryotic,
e.g., yeast or mammalian cell by methods well known in the art. See Ausubel et al., supra.
The coding sequences for the desired proteins, having been prepared or isolated, can be
cloned into any suitable vector or replicon. Numerous cloning vectors are known to those
of skill in the art and the selection of an appropriate cloning vector is a matter of choice.
Examples of recombinant DNA vectors for cloning and host cells which they can transform
include, but are not limited to, the bacteriophage λ (E. coli), pBR322 (E. coli), pACYC177
(E. coli), pKT230 (gram-negative bacteria), pGV1106 (gram-negative bacteria), pLAFR1
(gram-negative bacteria), pME290 (non-E. coli gram-negative bacteria), pHV14 (E. coli and
Bacillus subtilis), pBD9 (Bacillus), pIJ61 (Streptomyces), pUC6 (Streptomyces), YIp5
(Saccharomyces), a baculovirus insect cell system, a Drosophila insect system, YCp19
(Saccharomyces) and pSV2neo (mammalian cells). See generally, "DNA Cloning": Vols. I
& II, Glover et al. ed. IRL Press Oxford (1985) (1987); and T. Maniatis et al. ("Molecular
Cloning" Cold Spring Harbor Laboratory (1982)).

The gene can be placed under the control of control elements such as a promoter,
ribosome binding site (for bacterial expression) and, optionally, an operator, so that the
DNA sequence encoding the desired protein is transcribed into RNA in the host cell
transformed by a vector containing the expression construct. The coding sequence may or
may not contain a signal peptide or leader sequence. The proteins of the present invention
can be expressed using, for example, the E. coli tac promoter or the protein A gene (spa)
promoter and signal sequence. Leader sequences can be removed by the bacterial host in
post-translational processing. See, e.g., U.S. Patent Nos. 4,431,739; 4,425,437 and
4,338,397.

In addition to control sequences, it may be desirable to add regulatory sequences
which allow for regulation of the expression of the protein sequences relative to the growth
of the host cell. Regulatory sequences are known to those of skill in the art. Exemplary are
those which cause the expression of a gene to be turned on or off in response to a chemical
or physical stimulus, including the presence of a regulatory compound or to various
temperature or metabolic conditions. Other types of regulatory elements may also be
present in the vector, for example, enhancer sequences.

An expression vector is constructed so that the particular coding sequence is located
in the vector with the appropriate regulatory sequences, the positioning and orientation of
the coding sequence with respect to the control sequences being such that the coding
sequence is transcribed under the "control" of the control sequences, i.e., RNA polymerase
which binds to the DNA molecule at the control sequences transcribes the coding sequence.
Modification of the sequences encoding the particular antigen of interest may be desirable
to achieve this end. For example, in some cases it may be necessary to modify the sequence
so that it may be attached to the control sequences with the appropriate orientation; i.e., to
maintain the reading frame. The control sequences and other regulatory sequences may be
ligated to the coding sequence prior to insertion into a vector, such as the cloning vectors
described above. Alternatively, the coding sequence can be cloned directly into an
expression vector which already contains the control sequences and an appropriate
restriction site.

In some cases, it may be desirable to produce mutants or analogues of PSP1 protein.
Mutants or analogues may be prepared by the deletion of a portion of the sequence encoding
the protein, by insertion of a sequence, and/or by substitution of one or more nucleotides
within the sequence. Techniques for modifying nucleotide sequences, such as site-directed
mutagenesis, are well known to those skilled in the art. See, e.g., T. Maniatis et al., supra;
"DNA Cloning," Vols. I and II, supra; and "Nucleic Acid Hybridization", supra.

Depending on the expression system and host selected, the proteins of the present
invention are produced by growing host cells transformed by an expression vector described
above under conditions whereby the protein of interest is expressed. Preferred mammalian
cells include human embryonic kidney cells (293), monkey kidney cells, Chinese hamster
ovary (CHO) cells, Drosophila or murine L-cells. If the expression
system secretes the protein into growth media, the protein can be purified directly from the
media. If the protein is not secreted, it is isolated from cell lysates or recovered from the
cell membrane fraction. The selection of the appropriate growth conditions and recovery
methods is within the skill of the art.

An alternative method to identify proteins of the present invention is by
constructing gene libraries, using the resulting clones to transform E. coli and pooling and
screening individual colonies using polyclonal serum or monoclonal antibodies to PSP1.

The proteins of the present invention may also be produced by chemical synthesis
such as solid phase peptide synthesis on an automated peptide synthesizer, using known
amino acid sequences or amino acid sequences derived from the DNA sequence of the
genes of interest. Such methods are known to those skilled in the art.

The proteins of the present invention or their immunogenic fragments comprising at
least one epitope can be used to produce antibodies, both polyclonal and monoclonal,
directed to epitopes corresponding to amino acid sequences disclosed herein. If polyclonal
antibodies are desired, a selected mammal such as a mouse, rabbit, goat or horse is
immunized with a protein of the present invention, or its fragment, or a mutant protein.
Serum from the immunized animal is collected and treated according to known procedures.
Serum polyclonal antibodies can be purified by immunoaffinity chromatography or other
known procedures.

Monoclonal antibodies to the proteins of the present invention, and to the
immunogenic fragments thereof, can also be readily produced by one skilled in the art. The
general methodology for making monoclonal antibodies by using hybridoma technology is
well known. Immortal antibody-producing cell lines can be created by cell fusion and also
by other techniques such as direct transformation of B lymphocytes with oncogenic DNA or
transfection with Epstein-Barr virus. See, e.g., M. Schreier et al., "Hybridoma Techniques"
(1980); Hammerling et al., "Monoclonal Antibodies and T-cell Hybridomas" (1981);
Kennett et al., "Monoclonal Antibodies" (1980); and U.S. Patent Nos. 4,341,761; 4,399,121;
4,427,783; 4,444,887; 4,452,570; 4,466,917; 4,472,500; 4,491,632; and 4,493,890. Panels
of monoclonal antibodies produced against the antigen of interest, or fragment thereof, can
be screened for various properties, i.e., for isotype, epitope, affinity, etc. Monoclonal
antibodies are useful in purification, using immunoaffinity techniques, of the individual
antigens which they are directed against. Alternatively, genes encoding the monoclonals of
interest may be isolated from the hybridomas by PCR techniques known in the art and
cloned and expressed in the appropriate vectors. The antibodies of this invention, whether
polyclonal or monoclonal, have additional utility in that they may be employed as reagents
in immunoassays, RIA, ELISA, and the like. The antibodies of the invention can be labeled
with an analytically detectable reagent such as a radioisotope, fluorescent molecule or
enzyme.

Chimeric antibodies, in which non-human variable regions are joined or fused to
human constant regions (see, e.g., Liu et al., Proc. Natl. Acad. Sci. USA, 84, 3439 (1987)),
may also be used in assays or therapeutically. Preferably, a therapeutic monoclonal
antibody would be "humanized" as described in Jones et al., Nature, 321, 522 (1986);
Verhoeyen et al., Science, 239, 1534 (1988); Kabat et al., J. Immunol., 147, 1709 (1991);
Queen et al., Proc. Natl. Acad. Sci. USA, 86, 10029 (1989); Gorman et al., Proc. Natl. Acad.
Sci. USA, 88, 34181 (1991); and Hodgson et al., Biotechnology, 421 (1991).

Another aspect of the present invention is modulators of the polypeptides of the
invention or of D87258. Functional modulation of PSP1 or D87258 by a substance
includes partial to complete inhibition of function, such as inhibition of proteolytic activity,
identical function, as well as enhancement of function. Embodiments of modulators of the
invention include peptides, oligonucleotides and small organic molecules including
peptidomimetics. Modulators of the invention may be useful as therapeutics or
prophylactics for all forms of neurodegeneration including AD. Modulators of PSP1 or
D87258 proteolytic activity relative to other endogenous substrates may also be useful
for the treatment of other types of human disease states.

Another aspect of the invention is antisense oligonucleotides comprising a
sequence which is capable of binding to the polynucleotides of the invention. Synthetic
oligonucleotides or related antisense chemical structural analogs can be designed to
recognize, specifically bind to and prevent transcription of a target nucleic acid encoding
PSP1 or D87258 protein by those of ordinary skill in the art. See generally, Cohen, J.S.,
Trends in Pharm. Sci., 10, 435 (1989) and Weintraub, H.M., Scientific American, January
(1990) at page 40.
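
The base-pairing principle behind an antisense sequence can be shown as a
reverse-complement operation. This is a minimal sketch of that principle only, not a
design procedure for therapeutic oligonucleotides; the function name and the toy target
are hypothetical and unrelated to the sequence listing.

```python
# Watson-Crick complement for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target: str) -> str:
    """Return the reverse complement of an RNA target, written 5' to 3'."""
    target = target.upper()
    if set(target) - set("ACGU"):
        raise ValueError("target must contain only A, C, G and U")
    return target.translate(RNA_COMPLEMENT)[::-1]


# Toy target for illustration only (hypothetical).
toy_target = "AUGGCC"
print(antisense_rna(toy_target))  # GGCCAU
```

Applying the function twice returns the original target, which reflects the symmetry of
complementary base pairing.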

Another aspect of the invention is a method for assaying a medium for the presence
of a substance that modulates PSP1 or D87258 protein function by affecting the binding of
PSP1 or D87258 protein to cellular binding partners. Examples of modulators include, but
are not limited to, peptides and small organic molecules including peptidomimetics. A
PSP1 or D87258 protein is provided having the amino acid sequence of PSP1 (SEQ ID
NOs: 8, 25, 27 or 29) or D87258 (SEQ ID NO: 18) or a functional fragment thereof
together with a cellular binding partner or synthetic analog thereof. The mixture is
incubated with a test substance which is suspected of modulating PSP1 or D87258 activity,
under conditions which permit the formation of a PSP1 or D87258 gene product/cellular
binding partner complex. An assay is performed for the presence of the complex, free
PSP1 or D87258 protein or free cellular binding partner and the result compared to a
control to determine the effect of the test substance.
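
The final comparison to a control can be expressed as a simple ratio. The sketch below is
an illustration under stated assumptions: the complex is assumed to be quantified as a
single positive signal, and the function names, the example readings and the 10%
tolerance are all hypothetical choices that do not come from the text.

```python
def percent_of_control(test_signal: float, control_signal: float) -> float:
    """Complex signal with the test substance, as a percentage of the control."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * test_signal / control_signal


def classify_effect(test_signal: float, control_signal: float,
                    tolerance: float = 10.0) -> str:
    """Label the effect of a test substance; the tolerance is an arbitrary assumption."""
    pct = percent_of_control(test_signal, control_signal)
    if pct < 100.0 - tolerance:
        return "inhibits complex formation"
    if pct > 100.0 + tolerance:
        return "enhances complex formation"
    return "no apparent effect"


# Hypothetical readings in arbitrary units.
print(percent_of_control(40.0, 100.0))  # 40.0
print(classify_effect(40.0, 100.0))     # inhibits complex formation
print(classify_effect(105.0, 100.0))    # no apparent effect
```

In practice the tolerance would be set from replicate controls rather than fixed in
advance.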

Another aspect of the invention is a method for assaying a medium for the presence
of a substance that modulates PSP1 or D87258 protein function by inhibiting its proteolytic
activity on cellular substrates. Examples of modulators include, but are not limited to,
peptides and small organic molecules including peptidomimetics. Cellular substrates can
include PS-1, PS-2, APP or other substrates. A PSP1 or D87258 protein is provided having
the amino acid sequence of PSP1 (SEQ ID NOs: 8, 25, 27 or 29) or D87258 (SEQ ID NO:
18) or a functional fragment thereof together with a cellular substrate or synthetic analog
thereof. The mixture is incubated with a test substance which is suspected of inhibiting
PSP1 or D87258 activity, under conditions which permit the formation of a PSP1 or
D87258 enzyme/substrate complex and subsequent cleavage of the substrate.

Another aspect of the invention is a method for assaying for the presence of a
substance that modulates PSP1 or D87258 activity by direct binding to PSP1 or D87258
protein. Examples of modulators include, but are not limited to, peptides and small organic
molecules including peptidomimetics. Modulator candidates are synthesized on a solid
support by techniques such as those disclosed in Lam et al., Nature 354, 82 (1991) or
Burbaum et al., Proc. Natl. Acad. Sci. USA 92, 6027 (1995) to provide solid support-associated
modulator candidates. A labelled PSP1 or D87258 protein is provided having
the amino acid sequence of PSP1 (SEQ ID NOs: 8, 25, 27 or 29) or D87258 (SEQ ID NO:
18) or a functional derivative thereof. Exemplary labels include directly attached
fluorescent or colored dyes, biotin, radioisotopes or epitope tags, which are detectable by a
suitable antibody. A mixture of solid support-associated modulator candidates and labelled
PSP1 or D87258 protein is incubated under conditions which can permit the formation of a
PSP1 or D87258 protein/modulator candidate complex. The solid support is separated from
free soluble labelled PSP1 or D87258 protein. An assay is performed for the presence of
solid support-associated labelled protein. Solid supports complexed with labelled protein
are isolated and the identity of the modulator candidate determined by techniques well
known to those skilled in the art, such as the TOF-SIMS method in Brummel et al., Science
264, 399-402 (1994).

Modulation of PSP1 or D87258 function would be expected to have effects on
presenilin cleavage, the cleavage of other proteins or βA4 production. Any modulators so
identified would be expected to be useful as a therapeutic for the treatment and prevention
of neurodegeneration including FAD and AD.

Further, PSP1 or D87258 could be used to isolate proteins which interact with it
and this interaction could be a target for interference. Inhibitors of protein-protein
interactions between PSP1 or D87258 and other factors could lead to the development of
pharmaceutical agents for the modulation of PSP1 or D87258 activity.

Methods to assay for protein-protein interactions, such as that of a PSP1 or D87258
gene product/binding partner complex, and to isolate proteins interacting with PSP1 or
D87258 are known to those skilled in the art. Use of the methods discussed below enables
one of ordinary skill in the art to accomplish these aims without undue experimentation.

The yeast two-hybrid system provides methods for detecting the interaction
between a first test protein and a second test protein, in vivo, using reconstitution of the
activity of a transcriptional activator. The method is disclosed in U.S. Patent No.
5,283,173; reagents are available from Clontech and Stratagene. Briefly, PSP1 cDNA is
fused to a Gal4 or LexA transcription factor DNA binding domain and expressed in yeast
cells. cDNA library members obtained from cells of interest are fused to a transactivation
domain of Gal4 or another transactivation domain. cDNA clones which express proteins
which can interact with PSP1 will lead to reconstitution of transcription factor activity
such as Gal4 and transactivation of a reporter gene expression such as Gal1/lacZ.

An alternative method is screening of λgt11, λZAP (Stratagene) or equivalent
cDNA expression libraries with recombinant PSP1. Recombinant PSP1 protein or
fragments thereof are fused to small peptide tags such as FLAG, HSV or GST. The peptide
tags can possess convenient phosphorylation sites for a kinase such as heart muscle creatine
kinase or they can be biotinylated. Recombinant PSP1 can be phosphorylated with [32P] or
used unlabeled and detected with streptavidin or antibodies against the tags. λgt11 cDNA
expression libraries are made from cells of interest and are incubated with the recombinant
PSP1, washed and cDNA clones isolated which interact with PSP1. See, e.g., T. Maniatis
et al., supra.

Another method is the screening of a mammalian expression library in which the
cDNAs are cloned into a vector between a mammalian promoter and polyadenylation site
and transiently transfected in COS or 293 cells, followed by detection of the binding protein
48 hours later by incubation of fixed and washed cells with a labelled PSP1, preferably
iodinated, and detection of bound PSP1 by autoradiography (see Sims et al., Science 241,
585-589 (1988) and McMahan et al., EMBO J. 10, 2821-2832 (1991)). In this manner,
pools of cDNAs containing the cDNA encoding the binding protein of interest can be
selected and the cDNA of interest can be isolated by further subdivision of each pool
followed by cycles of transient transfection, binding and autoradiography. Alternatively,
the cDNA of interest can be isolated by transfecting the entire cDNA library into
mammalian cells and panning the cells on a dish containing PSP1 bound to the plate. Cells
which attach after washing are lysed and the plasmid DNA isolated, amplified in bacteria,
and the cycle of transfection and panning repeated until a single cDNA clone is obtained
(see Seed et al., Proc. Natl. Acad. Sci. USA 84, 3365 (1987) and Aruffo et al., EMBO J. 6,
[...]). If the binding [...]
pooling strategy once a binding or neutralizing assay has been established for assaying
supernatants from transiently transfected cells. General methods for screening supernatants
are disclosed in Wong et al., Science 228, 810-815 (1985).

Another alternative method is isolation of proteins interacting with PSP1 directly
from cells. Fusion proteins of PSP1 with GST or small peptide tags are made and
immobilized on beads. Biosynthetically labeled or unlabeled protein extracts from the cells
of interest are prepared, incubated with the beads and washed with buffer. Proteins
interacting with PSP1 are eluted specifically from the beads and analyzed by SDS-PAGE.
Binding partner primary amino acid sequence data are obtained by microsequencing.
Optionally, the cells can be treated with agents that induce a functional response such as
tyrosine phosphorylation of cellular proteins. An example of such an agent would be a
growth factor or cytokine such as erythropoietin or interleukin-3.

Another alternative method is immunoaffinity purification. Recombinant PSP1 is
incubated with labeled or unlabeled cell extracts and immunoprecipitated with anti-PSP1
antibodies. The immunoprecipitate is recovered with protein A-Sepharose and analyzed by
SDS-PAGE. Unlabelled proteins are labeled by biotinylation and detected on SDS gels
with streptavidin. Binding partner proteins are analyzed by microsequencing. Further,
standard biochemical purification steps known to those skilled in the art may be used prior
to microsequencing.

Yet another alternative method is screening of peptide libraries for binding
partners. Recombinant tagged or labeled PSP1 is used to select peptides from a peptide or
phosphopeptide library which interact with PSP1. Sequencing of the peptides leads to
identification of consensus peptide sequences which might be found in interacting proteins.

PSP1 or D87258 binding partners identified by any of these methods or other
methods which would be known to those of ordinary skill in the art, as well as those putative
binding partners discussed above, can be used in the assay method of the invention.
Assaying for the presence of a PSP1 or D87258/binding partner complex is accomplished
by, for example, the yeast two-hybrid system, ELISA or immunoassays using antibodies
specific for the complex. In the presence of test substances which interrupt or inhibit
formation of the PSP1 or D87258/binding partner interaction, a decreased amount of complex
will be determined relative to a control lacking the test substance.

Assays for free PSP1 or D87258, or binding partner, are accomplished by, for
example, ELISA or immunoassay using specific antibodies or by incubation of radiolabeled
PSP1 or D87258 with cells or cell membranes followed by centrifugation or filter
separation steps. In the presence of test substances which interrupt or inhibit formation of
the PSP1 or D87258/binding partner interaction, an increased amount of free PSP1 or D87258,
or free binding partner, will be determined relative to a control lacking the test substance.

Another aspect of the invention is pharmaceutical compositions comprising an
effective amount of a PSP1 or D87258 modulator of the invention and a pharmaceutically
acceptable carrier. Pharmaceutical compositions of modulators of this invention for
parenteral administration, i.e., subcutaneously, intramuscularly or intravenously, or for oral
administration can be prepared.

The compositions for parenteral administration will commonly comprise a solution 
of the modulators of the invention or a cocktail thereof dissolved in an acceptable carrier, 
preferably an aqueous carrier. A variety of aqueous carriers may be employed, e.g., water,
buffered water, 0.4% saline, 0.3% glycine and the like. These solutions are sterile and
generally free of particulate matter. These solutions may be sterilized by conventional, 
well-known sterilization techniques. The compositions may contain pharmaceutically 
acceptable auxiliary substances as required to approximate physiological conditions such as 
pH adjusting and buffering agents, etc. The concentration of the modulator of the invention 
in such pharmaceutical formulation can vary widely, i.e., from less than about 0.5%, 
usually at or at least about 1% to as much as 15 or 20% by weight and will be selected 
primarily based on fluid volumes, viscosities, etc. according to the particular mode of 
administration selected. 

Thus, a pharmaceutical composition of the modulator of the invention for
intramuscular injection could be prepared to contain 1 mL sterile buffered water, and 50 mg
of a protein of the invention. Similarly, a pharmaceutical composition of the modulator of
the invention for intravenous infusion could be made up to contain 250 mL of sterile
Ringer's solution, and 150 mg of a modulator of the invention. Actual methods for
preparing parenterally administrable compositions are well known or will be apparent to
those skilled in the art and are described in more detail in, for example, Remington's
Pharmaceutical Science, 15th ed., Mack Publishing Company, Easton, Pennsylvania.

The physician will determine the dosage of the present therapeutic agents which 
will be most suitable and it will vary with the form of administration and the particular 
compound chosen, and furthermore, it will vary with the particular patient under treatment. 
Generally, the physician will wish to initiate treatment with small dosages substantially less 
than the optimum dose of the compound and increase the dosage by small increments until 
the optimum effect under the circumstances is reached. It will generally be found that
when the composition is administered orally, larger quantities of the active agent will be
required to produce the same effect as a smaller quantity given parenterally. The
therapeutic dosage will generally be from 0.1 to 1000 milligrams per day and higher
although it may be administered in several different dosage units.

Depending on the patient condition, the pharmaceutical composition of the
invention can be administered for prophylactic and/or therapeutic treatments. In
therapeutic applications, compositions containing the present compounds or a cocktail
thereof are administered to a patient already suffering from a disease in an amount
sufficient to cure or at least partially arrest the disease and its complications. In
prophylactic applications, compositions containing the present compounds or a cocktail
thereof are administered to a patient not already in a disease state to enhance the patient's
resistance to the disease.

Single or multiple administrations of the pharmaceutical compositions can be 
carried out with dose levels and pattern being selected by the treating physician. In any 
event, the pharmaceutical composition of the invention should provide a quantity of the
modulators of the invention sufficient to effectively treat the patient. 

Additionally, some diseases result from inherited defective genes. These genes can
be detected by comparing the sequence of the defective gene with that of a normal one.
Individuals carrying mutations in the PSP1 or D87258 gene may be detected at the DNA
level by a variety of techniques. Nucleic acids used for diagnosis (genomic DNA, mRNA,
etc.) may be obtained from a patient's cells, such as from blood, urine, saliva or tissue
biopsy, e.g., chorionic villi sampling or removal of amniotic fluid cells, and autopsy
material. The genomic DNA may be used directly for detection or may be amplified
enzymatically by using PCR, ligase chain reaction (LCR), strand displacement
amplification (SDA), etc. prior to analysis. See, e.g., Saiki et al., Nature, 324, 163-166
(1986), Bej et al., Crit. Rev. Biochem. Molec. Biol., 26, 301-334 (1991), Birkenmeyer et
al., J. Virol. Meth., 35, 117-126 (1991), Van Brunt, J., Bio/Technology, 8, 291-294 (1990).
RNA or cDNA may also be used for the same purpose. As an example, PCR primers
complementary to the nucleic acid of the instant invention can be used to identify and
analyze PSP1 or D87258 mutations. For example, deletions and insertions can be detected
by a change in size of the amplified product in comparison to the normal PSP1 or D87258
genotype. Point mutations can be identified by hybridizing amplified DNA to radiolabeled
PSP1 or D87258 RNA of the invention or alternatively, radiolabeled PSP1 or D87258
antisense DNA sequences of the invention. Perfectly matched sequences can be
distinguished from mismatched duplexes by RNase A digestion or by differences in melting
temperatures (Tm). Such a diagnostic would be particularly useful for prenatal and even
neonatal testing.

In addition, point mutations and other sequence differences between the reference
gene and "mutant" genes can be identified by yet other well-known techniques, e.g., direct
DNA sequencing or single-strand conformational polymorphism. See Orita et al., Genomics,
5, 874-879 (1989). For example, a sequencing primer is used with double-stranded PCR
product or a single-stranded template molecule generated by a modified PCR. The
sequence determination is performed by conventional procedures with radiolabeled
nucleotides or by automatic sequencing procedures with fluorescent tags. Cloned DNA
segments may also be used as probes to detect specific DNA segments. The sensitivity of
this method is greatly enhanced when combined with PCR. Further, point mutations and
other sequence variations, such as polymorphisms, can be detected as described above, e.g.,
through the use of allele-specific oligonucleotides for PCR amplification of sequences that
differ by single nucleotides. Oligonucleotides having sequences as set forth in SEQ ID
NOs: 32, 33, 34, 35, 36, 37, 38, 39 and 40 are useful in such a method. These methods are
useful for determining the genetic predisposition to neurodegeneration in a patient by
detecting polymorphisms within PSP1 or D87258 in a sample from a patient. Preferably,
the polymorphisms detected are at nucleotide 672 of PSP1, at nucleotide 1435 of PSP1 or at
nucleotide 1325 of D87258. Preferably, the polymorphisms are detected by PCR; most
preferably, the polymorphisms are detected by PCR with oligonucleotides having a
nucleotide sequence selected from the group consisting of SEQ ID NOs: 32, 33, 34, 35, 36,
37, 38, 39 and 40. Preferably, the neurodegeneration predisposition determined is to
Alzheimer's disease.

Genetic testing based on DNA sequence differences may be achieved by detection
of alteration in electrophoretic mobility of DNA fragments in gels with or without
denaturing agents. Small sequence deletions and insertions can be visualized by high
resolution gel electrophoresis. DNA fragments of different sequences may be distinguished
on denaturing formamide gradient gels in which the mobilities of different DNA fragments
are retarded in the gel at different positions according to their specific melting or partial
melting temperatures. See, e.g., Myers et al., Science, 230, 1242 (1985). In addition,
sequence alterations, in particular small deletions, may be detected as changes in the
migration pattern of DNA heteroduplexes in non-denaturing gel electrophoresis such as
heteroduplex electrophoresis. See, e.g., Nagamine et al., Am. J. Hum. Genet., 45, 337-339
(1989). Sequence changes at specific locations may also be revealed by nuclease protection
assays, such as RNase and S1 protection, or the chemical cleavage method as disclosed by
Cotton et al. in Proc. Natl. Acad. Sci. USA, 85, 4397-4401 (1985).

Thus, the detection of a specific DNA sequence may be achieved by methods such
as hybridization (e.g., heteroduplex electrophoresis, see White et al., Genomics, 12,
301-306 (1992)), RNase protection (e.g., Myers et al., Science, 230, 1242 (1985)), chemical
cleavage (e.g., Cotton et al., Proc. Natl. Acad. Sci. USA, 85, 4397-4401 (1985)), direct
DNA sequencing, or the use of restriction enzymes (e.g., restriction fragment length
polymorphisms (RFLP) in which variations in the number and size of restriction fragments
can indicate insertions, deletions, presence of nucleotide repeats and any other mutation
which creates or destroys an endonuclease restriction sequence). Southern blotting of
genomic DNA may also be used to identify large (i.e., greater than 100 base pair) deletions
and insertions.
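The RFLP principle described above can be illustrated with a short sketch. The two sequences below are invented for illustration and are not PSP1 or D87258 sequence; EcoRI (GAATTC, cutting after the G) is used only because its recognition site is widely known, and the function name is hypothetical. A single base change that destroys one site alters both the number and the sizes of the fragments.

```python
# Hypothetical illustration of RFLP: a point mutation that destroys a
# restriction site changes the number and sizes of digestion fragments.
# The sequences are invented for this sketch; they are NOT PSP1 or D87258.

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC (G^AATTC)


def digest(seq, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Return fragment lengths after cutting a linear sequence at every site."""
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    fragments = []
    previous = 0
    for cut in cut_positions:
        fragments.append(cut - previous)
        previous = cut
    fragments.append(len(seq) - previous)
    return fragments


reference = "ATGCCGTAGAATTCGGCTTAACGGATCGAATTCTTAGC"
# Single base change (GAATTC -> GAGTTC) destroys the first EcoRI site.
mutant = "ATGCCGTAGAGTTCGGCTTAACGGATCGAATTCTTAGC"

reference_fragments = digest(reference)
mutant_fragments = digest(mutant)
print("reference:", reference_fragments)
print("mutant:   ", mutant_fragments)
```

In this toy case the reference yields three fragments and the mutant two, which is the kind of difference a gel or Southern blot would reveal.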

In addition to conventional gel electrophoresis and DNA sequencing, mutations
such as microdeletions, aneuploidies, translocations and inversions can also be detected by
in situ analysis. See, e.g., Keller et al., DNA Probes, 2nd Ed., Stockton Press, New York,
N.Y., USA (1993). That is, DNA or RNA sequences in cells can be analyzed for mutations
without isolation and/or immobilization onto a membrane. Fluorescence in situ
hybridization (FISH) is presently the most commonly applied method and numerous
reviews of FISH have appeared. See, e.g., Trachuck et al., Science, 250, 559-562 (1990),
and Trask et al., Trends Genet., 7, 149-154 (1991). Hence, by using nucleic acids based on
the structure of the PSP1 or D87258 genes, one can develop diagnostic tests for genetic
mutations.

In addition, some diseases are a result of, or are characterized by, changes in gene
expression which can be detected by changes in the mRNA. Alternatively, the PSP1 or
D87258 gene can be used as a reference to identify individuals expressing an increased or
decreased level of PSP1 or D87258 mRNA, e.g., by Northern blotting or in situ
hybridization.

Defining appropriate hybridization conditions is within the skill of the art. See,
e.g., "Current Protocols in Mol. Biol." Vol. I & II, Wiley Interscience, Ausubel et al. (eds.)
(1992). Probing technology is well known in the art and it is appreciated that the size of the
probes can vary widely but it is preferred that the probe be at least 15 nucleotides in length.
It is also appreciated that such probes can be and are preferably labeled with an analytically
detectable reagent to facilitate identification of the probe. Useful reagents include but are
not limited to radioisotopes, fluorescent dyes or enzymes capable of catalyzing the
formation of a detectable product. As a general rule, the more stringent the hybridization
conditions, the more closely related the recovered genes will be.

The putative role of PSP1 or D87258 in presenilin biochemistry establishes yet
another aspect of the invention which is gene therapy. "Gene therapy" means gene
supplementation where an additional reference copy of a gene of interest is inserted into a
patient's cells. As a result, the protein encoded by the reference gene corrects the defect
and permits the cells to function normally, thus alleviating disease symptoms. The
reference copy would be a wild-type form of the PSP1 or D87258 gene or a gene encoding
a protein or peptide which modulates the activity of the endogenous PSP1 or D87258.

Gene therapy of the present invention can occur in vivo or ex vivo. Ex vivo gene
therapy requires the isolation and purification of patient cells, the introduction of a
therapeutic gene and introduction of the genetically altered cells back into the patient. A
replication-deficient virus such as a modified retrovirus can be used to introduce the
therapeutic PSP1 or D87258 gene into such cells. For example, mouse Moloney leukemia
virus (MMLV) is a well-known vector in clinical gene therapy trials. See, e.g., Boris-Lawrie
et al., Curr. Opin. Genet. Dev., 3, 102-109 (1993).

In contrast, in vivo gene therapy does not require isolation and purification of a
patient's cells. The therapeutic gene is typically "packaged" for administration to a patient
such as in liposomes or in a replication-deficient virus such as adenovirus as described by
Berkner, K.L., in Curr. Top. Microbiol. Immunol., 158, 39-66 (1992) or adeno-associated
virus (AAV) vectors as described by Muzyczka, N., in Curr. Top. Microbiol. Immunol.,
158, 97-129 (1992) and U.S. Patent No. 5,252,479. Another approach is administration of
"naked DNA" in which the therapeutic gene is directly injected into the bloodstream or
muscle tissue. Still another approach is administration of "naked DNA" in which the
therapeutic gene is introduced into the target tissue by microparticle bombardment using
gold particles coated with the DNA.

Cell types useful for gene therapy of the present invention include lymphocytes,
hepatocytes, myoblasts, fibroblasts, any cell of the eye such as retinal cells, epithelial and
endothelial cells. Preferably the cells are T lymphocytes drawn from the patient to be
treated, hepatocytes, any cell of the eye or respiratory or pulmonary epithelial cells.
Transfection of pulmonary epithelial cells can occur via inhalation of a nebulized
preparation of DNA vectors in liposomes, DNA-protein complexes or replication-deficient
adenoviruses. See, e.g., U.S. Patent No. 5,240,846.

Another aspect of the invention is transgenic, non-human mammals capable of
expressing the polynucleotides of the invention or D87258 in any cell. Transgenic,
non-human animals may be obtained by transfecting appropriate fertilized eggs or embryos of a
host with the polynucleotides of the invention, with D87258 or with mutant forms found in
human diseases. See, e.g., U.S. Patent Nos. 4,736,866; 5,175,385; 5,175,384 and
5,175,386. The resultant transgenic animal may be used as a model for the study of PSP1
or D87258 gene function. Particularly useful transgenic animals are those which display a
detectable phenotype associated with the expression of the PSP1 or D87258 protein. Drug
development candidates may then be screened for their ability to reverse or exacerbate the
relevant phenotype.


The present invention will now be described with reference to the following 
specific, non-limiting examples.

Example 1 - Identification of the PS-1 Binding Partner PSP1

A portion of PS-1 cDNA (GenBank Accession No. L42110) (SEQ ID NO: 9)
encoding residues 269-413 of the PS-1 amino acid sequence (SEQ ID NO: 10) was PCR
amplified with the oligonucleotide primers 5'-CGGAATTCCGTATGCTGGTTGAAACA-3'
(SEQ ID NO: 11) and 5'-CGGGATCCTCAGGCTACGAAACAGGCTAT-3' (SEQ ID
NO: 12). The product was digested with EcoRI and BamHI and cloned into pEG202
(Golemis et al., in Current Protocols in Molecular Biology, John Wiley & Sons, New York
(1994)). The resulting plasmid, pCC352, encoded a fusion protein in which the DNA
binding protein, LexA, was fused in-frame to amino acids 269-413 of PS-1. The parent
vector, pEG202, was a yeast expression vector which uses the alcohol dehydrogenase
(ADH1) promoter to express the LexA fusion proteins and HIS3 as the selectable marker.
Sequence analysis using an automated DNA sequencer (Applied Biosystems, Inc.)
confirmed that the amplified region had the correct sequence and was fused in-frame to
LexA.

All procedures, plasmids and strains used in the two-hybrid screen have been
described in detail by Golemis et al., supra. Yeast strain EGY48 (MATα, trp1, his3, ura3,
6ops-LEU2) was cotransformed with the plasmids pCC352 and pSH18-34. Transformants
were selected using complete minimal media lacking uracil and histidine. The plasmid
pSH18-34 is a yeast expression vector in which eight LexA operator sites are located
upstream of a minimal GAL1 promoter which drives the expression of the LacZ gene, with
URA3 as a selectable marker. Synthesis of the full length LexA-PS-1 fusion was confirmed
by Western blot analysis of yeast extracts using polyclonal antisera directed against LexA.
It was confirmed that the LexA-PS-1 fusion alone was unable to activate either the LEU2
or the LacZ reporter. In addition, the ability of the LexA-PS-1 fusion to enter the
nucleus and bind DNA was confirmed using a repression assay.
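As a quick check on the cloning strategy of this example, the following sketch confirms that the two PCR primers used to amplify the PS-1 fragment carry the EcoRI and BamHI recognition sites needed for cloning into pEG202. The primer sequences are those quoted in the text (SEQ ID NOs: 11 and 12); the recognition sequences GAATTC and GGATCC are the standard ones for these enzymes; the variable and function names are illustrative only.

```python
# Check that each cloning primer contains the expected restriction site.
# Primer sequences are those quoted in Example 1 (SEQ ID NOs: 11 and 12);
# the names below are illustrative only.

RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
}

PRIMERS = {
    "SEQ ID NO: 11": "CGGAATTCCGTATGCTGGTTGAAACA",
    "SEQ ID NO: 12": "CGGGATCCTCAGGCTACGAAACAGGCTAT",
}


def sites_in(primer):
    """Return {enzyme: 0-based position} for every site found in the primer."""
    found = {}
    for enzyme, site in RESTRICTION_SITES.items():
        position = primer.find(site)
        if position != -1:
            found[enzyme] = position
    return found


primer_sites = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, found in primer_sites.items():
    print(name, found)
```

Each primer carries exactly one of the two sites near its 5' end, which is consistent with directional cloning of the digested product.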

A strain containing the LexA-PS-1 fusion and pSH18-34 (CCY321) was
transformed with a human fetal brain cDNA library (Clontech) in plasmid pJG4-5 using a
library scale transformation protocol. This library plasmid contains the TRP1 selectable
marker and allows the expression of cDNAs as fusions (AD fusions) to a cassette
containing the SV40 nuclear localization sequence, the acid blob B42, and the
hemagglutinin epitope tag. See Gyuris et al., Cell 75, 791-803 (1993). Expression of this
fusion is under control of the galactose inducible promoter GAL1. Transformation
reactions were plated onto complete minimal media lacking uracil, histidine and
tryptophan. Approximately 4.5 x 10^6 individual transformants were obtained, pooled and
frozen. To ensure that each primary colony was replated during the selection procedure, 2
x 10^7 viable cells (approximately 3 times the number of individual transformants) were
plated onto minimal media lacking uracil, histidine, tryptophan and leucine with
galactose/raffinose as the carbon source to induce expression of AD fusions. Colonies
arising after 3 and 4 days of growth at 30 °C were picked to complete minimal media
lacking uracil, histidine and tryptophan. Colonies containing potential interacting fusion
proteins were then tested for galactose dependence and LacZ expression. Those isolates
which activated both the LEU2 and LacZ reporters in a galactose dependent fashion were
considered positive and pursued further. Plasmids were isolated from yeast, used to
transform E. coli strain KC8, and AD fusion plasmids selected by growth on minimal E.
coli media lacking tryptophan. Each AD fusion plasmid containing a potential interacting
fusion was used to transform CCY321. Several transformants were subjected to screening
for galactose dependent LEU2 and LacZ activation. To ensure that the interaction was
specific, the ability of each AD fusion plasmid to interact with 22 nonrelated LexA fusion
proteins was tested. AD fusion plasmids which passed this second round of screening and
interacted specifically with the LexA-PS-1 fusion were identified.

Example 2 - PSP1 cDNA Cloning and Sequence Analysis

The AD fusion plasmids were subjected to restriction digest analysis and
sequencing as indicated above. Sequence analysis of one of the interacting fusion protein
cDNAs revealed a 519 nucleotide open reading frame (SEQ ID NO: 1) encoding a 173
amino acid (SEQ ID NO: 2) protein starting with a GGA at position 2 and terminating
with a TGA at position 523 of SEQ ID NO: 1. GenBank searches using the BLASTX and
BLASTN algorithms with the cDNA sequence or with the deduced amino acid sequence
indicated homology to a portion of the E. coli serine protease htrA described by Lipinska et
al., supra (SEQ ID NOs: 13 and 14). This novel cDNA was designated PSP1.

To obtain a greater portion of the cDNA, the oligonucleotide 5'-
CTGGATGGGGAGGTGATTGGAGTG-3' (SEQ ID NO: 15), representing bp 83-106 of
SEQ ID NO: 1, was used to screen a Superscript human brain cDNA library (Gibco BRL)
using the Genetrapper cDNA positive selection system (Gibco BRL). Colonies were
screened using whole cell PCR or standard hybridization conditions as described by Innis et
al., PCR Protocols: A Guide to Methods and Applications, Academic Press, San Diego, CA
(1990) and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold
Spring Harbor Press, Cold Spring Harbor, NY (1989). Those isolates which contained
PSP1 were subjected to restriction digest analysis and sequencing. The longest clones,
SEQ ID NO: 3 and SEQ ID NO: 5, were sequenced in their entirety.
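The open reading frame figures reported for SEQ ID NO: 1 can be cross-checked with simple arithmetic: a protein of N residues is encoded by 3N nucleotides, followed by a three-base stop codon. The sketch below assumes 1-based positions and that the quoted termination position is the last base of the stop codon; the function name is illustrative and not part of the original disclosure.

```python
# Consistency check for reported ORF coordinates (1-based positions).
# Assumption: the quoted "terminating" position is the last base of the
# stop codon, which immediately follows the final sense codon.

def orf_is_consistent(start, stop_codon_end, orf_nt, protein_aa):
    """True if ORF length, protein length and coordinates agree."""
    coding_ok = orf_nt == 3 * protein_aa
    # coding bases occupy start .. start + orf_nt - 1; the stop codon adds 3
    end_ok = stop_codon_end == start + orf_nt - 1 + 3
    return coding_ok and end_ok


# Values reported for SEQ ID NO: 1: 519 nt ORF, 173 residues,
# first codon at position 2, TGA ending at position 523.
seq1_ok = orf_is_consistent(start=2, stop_codon_end=523, orf_nt=519, protein_aa=173)
print(seq1_ok)
```

The same check can be applied to the coordinates given for the longer clones as a guard against transcription errors.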

Sequence analysis of SEQ ID NO: 3 revealed a 969 nucleotide open reading frame
encoding a 323 amino acid (SEQ ID NO: 4) protein starting with a CCC at position 1 and
terminating with a TGA at position 972 of SEQ ID NO: 3. Sequence analysis of SEQ ID
NO: 5 revealed a 1269 nucleotide open reading frame encoding a 423 amino acid (SEQ ID
NO: 6) protein starting with a CTT at position 1 and terminating with a TGA at position
1272 of SEQ ID NO: 5.

A second round of screening was performed using the oligonucleotide 5'-
GTCTCTGGGCCCCGGTTGTCTGTTG-3' (SEQ ID NO: 16), representing bp 5-28 of SEQ
ID NO: 5; the library and screening protocol remained unchanged. In the second round of
screening, the isolate designated SEQ ID NO: 7 contained the longest cDNA clone.
Sequence analysis of SEQ ID NO: 7 revealed a 1374 nucleotide open reading frame
encoding a 458 amino acid (SEQ ID NO: 8) protein starting with an ATG at position 251
and terminating with a TGA at position 1627 of SEQ ID NO: 7. However, SEQ ID NO: 7
does not have a stop codon upstream from the potential initiation codon. To confirm that
the predicted start codon is authentic, the 5' nucleotide sequence was extended with 5'
RACE using "Marathon Ready" human brain cDNA (Clontech) and a nested set of primers.
A SEQ ID NO: 7 specific primer, 5'-CCAACAGACAACCGGGCCCAGAGACT-3' (SEQ
ID NO: 20), and 5' anchor primer-1 (Clontech) were used in the first PCR amplification and
a SEQ ID NO: 7 specific primer, 5'-TGCCTCCTCGCCCGCCCTACTCAGA-3' (SEQ ID
NO: 21), and 5' anchor primer-2 (Clontech) were used in the second PCR amplification.
PCR products were T/A cloned into pCR2.1 (Invitrogen). Eighteen isolates with staggered
5' ends were analyzed and a 5' consensus sequence of 587 nucleotides was generated (SEQ
ID NO: 22). Alignment of SEQ ID NO: 22 and SEQ ID NO: 7 to generate a consensus
sequence (SEQ ID NO: 23) indicates that at nucleotide position 225 there is an in-frame
stop codon and the first methionine corresponds to that predicted in SEQ ID NO: 7. This
gene is designated PSP1-2.

Consensus full length sequences for the genes designated PSP1-1 (SEQ ID NOs: 24
and 25), PSP1-3 (SEQ ID NOs: 26 and 27) and PSP1-4 (SEQ ID NOs: 28 and 29) were
generated from alignment of the 5' consensus sequence (SEQ ID NO: 22), other partial
PSP1 clones, and with SEQ ID NOs: 7, 3 and 5, respectively.

Alignment of the deduced amino acid sequence of PSP1-1 (SEQ ID NO: 25) to E.
coli htrA (SEQ ID NO: 14) was accomplished using the BESTFIT algorithm (University of
Wisconsin Genetics Computer Group). An approximate similarity of 55% and an identity
of 33.5% at the amino acid level were observed; the alignment is shown in Fig. 1 (top,
PSP1-1; bottom, E. coli htrA). The critical histidine and the serine motif GXSXG conserved
in all serine proteases are present in PSP1-1 at amino acid positions 198 and 304-308,
respectively, and are indicated in bold. Amino acid numbers are indicated at the left and
right of the sequence alignment.

Nucleotide sequence comparison of PSP1-2, PSP1-1, PSP1-3 and PSP1-4 using the
PILEUP and PRETTY algorithms (University of Wisconsin Genetics Computer Group)
with gap creation and extension penalties of 5.0 and 0.3, respectively, is shown in Fig. 2.
The alignment results indicate that at nucleotide position 1541 of the alignment, PSP1-2
and PSP1-1 contain a 225 bp deletion and PSP1-4 contains a 195 bp deletion. Within the
same alignment at nucleotide position 1942, PSP1-4 lacks 96 bp that are present in PSP1-2,
PSP1-1 and PSP1-3. At the junction of each deletion site there is a splice site consensus
sequence AGG or TGG (indicated in bold), suggesting that these alternate forms are due to
alternative splicing. See Mount, S., in Nucl. Acids Res. 10, 458-472 (1982). The apparent
splicing event at position 1541 results in the removal of a stop codon (underlined in Fig. 2)
that is present in PSP1-3. In addition, PSP1-2 and PSP1-1 contain a single nucleotide
difference at position 672 of the alignment. PSP1-2 contains a T at this position, producing
the codon TGC which codes for a cysteine, while PSP1-1 contains a C at the same position,
producing the codon CGC which codes for an arginine.
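Under the standard genetic code, the single-base difference at alignment position 672 changes the encoded residue. A minimal sketch using only the two codons named in the text follows; the lookup table is deliberately partial, and the variable names are illustrative.

```python
# Translate the two codons that differ at alignment position 672.
# Only the entries needed here are included; this is not a full codon table.

STANDARD_CODE_SUBSET = {
    "TGC": "Cys",  # cysteine
    "CGC": "Arg",  # arginine
}

codon_with_t = "TGC"  # form carrying T at the polymorphic position
codon_with_c = "CGC"  # form carrying C at the polymorphic position

residue_t = STANDARD_CODE_SUBSET[codon_with_t]
residue_c = STANDARD_CODE_SUBSET[codon_with_c]

# The codons differ only at their first base, so the change is non-synonymous.
differing_positions = [i for i in range(3) if codon_with_t[i] != codon_with_c[i]]
print(residue_t, residue_c, differing_positions)
```

This matches the Arg to Cys change associated with the cytidine to thymine polymorphism in Example 4.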

Nucleotide sequence comparison of PSP1-1 (SEQ ID NO: 24) to the putative
human serine protease of Ohno et al., supra (SEQ ID NO: 17), indicated a 49% identity
using the GAP algorithm and 65% using the BESTFIT algorithm (data not shown).
Alignment of the deduced amino acid sequence of PSP1-1 (SEQ ID NO: 25) to the D87258
protease of Ohno et al., supra (SEQ ID NO: 18), was accomplished using the BESTFIT
algorithm and is shown in Fig. 3 (top, PSP1-1; bottom, Ohno et al. D87258 protease). An
approximate identity of 46% at the amino acid level was observed.

Example 3 - Tissue Distribution of PSP J 

Northern analysis was carried out to determine the distribution of PSP1 mRNA in
human tissues. A 30-base oligonucleotide probe directed against the PSP1 sequence was
used (5'-ATGCTGAACATCGGGAAAGCTTGGTTCTCG-3') (SEQ ID NO: 19). This
probe was 3'-end labelled with [32P]-dATP. Northern blots containing mRNA from
multiple human tissues (Clontech #7750-1, #7760-1, and #7755-1) were hybridized with
this probe under stringent conditions. A major band of approximately 1.9 kb was detected
in all regions investigated: heart, brain, lung, placenta, liver, skeletal muscle, kidney,
pancreas, amygdala, caudate nucleus, corpus callosum, hippocampus, substantia nigra,
subthalamic nucleus, thalamus, cerebellum, cerebral cortex, medulla, spinal cord, occipital
pole, frontal lobe, temporal pole, and putamen. PSP1 mRNA was also detected in
Alzheimer's disease brain.
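
A probe intended to hybridize to mRNA must be antisense to the coding strand, so its reverse complement should appear in the PSP1 cDNA. The sketch below checks this for the 30-base probe against a 40-base stretch of SEQ ID NO:1 (approximately positions 281-320 of the listing as printed). The helper name `reverse_complement` and the variable names are illustrative assumptions.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


PROBE = "ATGCTGAACATCGGGAAAGCTTGGTTCTCG"  # SEQ ID NO: 19, 5' to 3'
# Stretch of SEQ ID NO:1 as printed in the listing (about positions 281-320).
SEQ1_FRAGMENT = "CTTCGAGAACCAAGCTTTCCCGATGTTCAGCATGGTGTAC"

target = reverse_complement(PROBE)
offset = SEQ1_FRAGMENT.find(target)  # -1 would mean the probe does not match
print(len(PROBE), target, offset)
```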
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Example 4 - Detecting the PSP1 polymorphisms 

PSP1 oligonucleotides 1AFC, 1AFT and 1AR were designed for detecting the
polymorphism at nucleotide 672 (cytidine to thymine) causing the Arg to Cys amino acid
change. The Allele Specific Oligonucleotides (ASO) 1AFC and 1AFT are identical apart
from their 3' end bases and provide the specificity for screening for the polymorphism.

1AFC: CAT CCG GCA TTG TTA GCT CTG C    22mer (SEQ ID NO:32)
1AFT: CAT CCG GCA TTG TTA GCT CTG T    22mer (SEQ ID NO:33)
1AR:  CAA TAG CTG CAT CAG TTT GAA TG   23mer (SEQ ID NO:34)
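
The design rule stated above, that the two allele-specific oligonucleotides differ only in their 3' terminal base, can be checked mechanically together with the stated primer lengths. This is a minimal sketch; the dictionary layout and helper names are assumptions introduced for illustration.

```python
PRIMERS = {
    "1AFC": "CAT CCG GCA TTG TTA GCT CTG C",   # SEQ ID NO:32
    "1AFT": "CAT CCG GCA TTG TTA GCT CTG T",   # SEQ ID NO:33
    "1AR": "CAA TAG CTG CAT CAG TTT GAA TG",   # SEQ ID NO:34
}


def bases(primer: str) -> str:
    """Strip the triplet spacing used in the text and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def differ_only_at_3prime(primer_a: str, primer_b: str) -> bool:
    """True when two primers are identical except for their last (3') base."""
    a, b = bases(primer_a), bases(primer_b)
    return len(a) == len(b) and a[:-1] == b[:-1] and a[-1] != b[-1]


lengths = {name: len(bases(seq)) for name, seq in PRIMERS.items()}
aso_pair_ok = differ_only_at_3prime(PRIMERS["1AFC"], PRIMERS["1AFT"])
print(lengths, aso_pair_ok)
```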

Pairs of oligonucleotides (1AFC + 1AR, or 1AFT + 1AR) were used in a PCR
under the following conditions: 94°C for 40 seconds, 60°C for 30 seconds, for 35 cycles in
a reaction containing 1 U KlenTaq1 (GenPak Ltd.), 50mM Tris-Cl pH9.1, 16mM
ammonium sulphate, 3.5mM MgCl2, 150µg ml-1 BSA and 25ng of human genomic DNA of
unknown source. Each pair of oligonucleotides was tested against 12 random samples of
genomic DNA and the products electrophoresed on a 4% agarose (Gibco-BRL) gel. The
expected product of 95 base pairs was seen for both ASOs in 8 of the 12 DNAs, indicating
that these individuals are heterozygous for this polymorphism. Two of the DNAs amplified
with only the 1AFC oligonucleotide and are thus homozygous for the allele with the
cytidine at this position. Two of the DNAs amplified with only the 1AFT oligonucleotide
and are thus homozygous for the allele with the thymine at this position.
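
The genotype calls in this example follow a simple rule: amplification with both ASOs indicates a heterozygote, and amplification with only one indicates a homozygote for that allele. The sketch below encodes that rule and tallies the 12 samples as reported (8, 2 and 2). The function name and the boolean encoding of the gel results are illustrative assumptions.

```python
from collections import Counter


def call_genotype(amplified_with_c: bool, amplified_with_t: bool) -> str:
    """Call a genotype from the presence or absence of each ASO product."""
    if amplified_with_c and amplified_with_t:
        return "C/T"
    if amplified_with_c:
        return "C/C"
    if amplified_with_t:
        return "T/T"
    return "no call"


# (product with the C-specific ASO, product with the T-specific ASO) per sample,
# encoded from the counts reported in the text.
results = [(True, True)] * 8 + [(True, False)] * 2 + [(False, True)] * 2

genotypes = Counter(call_genotype(c, t) for c, t in results)
called = sum(genotypes[g] for g in ("C/C", "C/T", "T/T"))
c_allele_frequency = (2 * genotypes["C/C"] + genotypes["C/T"]) / (2 * called)
print(dict(genotypes), c_allele_frequency)
```

In this small sample the two alleles come out equally frequent, although 12 individuals of unknown origin are too few to estimate a population frequency.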

PSP1 oligonucleotides 1BFC, 1BFT and 1BR were designed for detecting the
polymorphism at nucleotide 1435 (cytidine to thymine) causing the Ala to Val amino acid
change.

1BFC: TGG CGG GCT TTG GGG GGC ATT C    22mer (SEQ ID NO:35)
1BFT: TGG CGG GCT TTG GGG GGC ATT T    22mer (SEQ ID NO:36)
1BR:  GAC GTC AGC AGG GCC CGG AGG TC   23mer (SEQ ID NO:37)

Pairs of oligonucleotides (1BFC + 1BR, or 1BFT + 1BR) were used in a PCR under
the following conditions: 94°C for 40 seconds, 67°C for 30 seconds, for 35 cycles in a
reaction containing 1 U KlenTaq1 (GenPak Ltd.), 50mM Tris-Cl pH9.1, 16mM ammonium
sulphate, 3.5mM MgCl2, 150µg ml-1 BSA and 25ng of human genomic DNA of unknown
source. Each pair of oligonucleotides was tested against 12 random samples of genomic
DNA and the products electrophoresed on a 4% agarose (Gibco-BRL) gel. The expected
product of 75 base pairs was seen using the 1BFT ASO in 9 of the 12 samples, indicating
that the other 3 individuals have a different allele at this position.

Example 5 - Detecting the D87258 polymorphism

Oligonucleotides 2AFG, 2AFT and 2AR were designed for detecting the
polymorphism at nucleotide 1325 (guanine to thymine) causing the Gly to Val amino acid
change.

2AFG: GAT ACC CCA GCA GAA GCT GG       20mer (SEQ ID NO:38)
2AFT: GAT ACC CCA GCA GAA GCT GT       20mer (SEQ ID NO:39)
2AR:  GCT GAC ATC ATT GGC GGA GAC      21mer (SEQ ID NO:40)

Pairs of oligonucleotides (2AFG + 2AR, or 2AFT + 2AR) were used in a PCR
under the following conditions: 94°C for 40 seconds, 62°C for 30 seconds, for 35 cycles in
a reaction containing 1 U KlenTaq1 (GenPak Ltd.), 50mM Tris-Cl pH9.1, 16mM
ammonium sulphate, 3.5mM MgCl2, 150µg ml-1 BSA and 25ng of human genomic DNA of
unknown source. Each pair of oligonucleotides was tested against 12 random samples of
genomic DNA and the products electrophoresed on a 4% agarose (Gibco-BRL) gel. The
2AFT ASO generated a band of approximately 1000 bp. The predicted band was 90 bp.
Presumably, the presence of the larger bands was due to the presence of an intron in the
region flanked by oligonucleotides 2AR and 2AFT. Bands were observed in all of the
samples amplified with 2AFT, indicating that the allele containing the thymine is present in
all 12 individuals.

The present invention may be embodied in other specific forms without departing
from the spirit or essential attributes thereof, and, accordingly, reference should be made to
the appended claims, rather than to the foregoing specification, as indicating the scope of
the invention.

SEQ ID NO:1:

GGGACTCCCC CAAACCAATG TGGAATACAT TCAAACTGAT GCAGCTATTG ATTTTGGAAA 60 

CTCTGGAGGT CCCCTGGTTA ACCTGGATGG GGAGGTGATT GGAGTGAACA CCATGAAGGT 120 

CACAGCTGGA ATCTCCTTTG CCATCCCTTC TGATCGTCTT CGAGAGTTTC TGCATCGTGG 180 

GGAAAAGAAG AATTCCTCCT CCCCAATCAG TCGCTCCCAC CGGCGCTACA TTGGGGTGAT 240 

GATGCTGACC CTGAGTCCCA GCATCCTTGC TGAAC7ACAG CTTCGAGAAC CAAGCTTTCC 300 

CGATGTTCAG CATGGTGTAC TCATCCATAA AGTCATCCTG GGCTCCCCTG CACACCGGGC 360 

TGGTCTGCGG CCTGGTGATG TGATTTTGGC CATTGGGGAG CAGATGGTAC AAAATGCTGA 4 20 

AGATGTTTAT GAAGCTGTTC GAACCCAATC CCAGTTGGCA GTGCAGATCC GGCGGGCACG 4 80 

AGAAACACTG ACCTTATATG TGACCCCTGA GGTCACAGAA TCAATAGATC ACCAACACTA S4 0 

TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGrTCTGAG GGCACCGAGA 600 

CAGAGGGTTA AATCAACCAC TGGGCCCAGC TCCCTCCAAC CACCAGCACT GACTCCTGGG 660 

CTCTGAAGAA TCACAGAAAC* ACTTTTTATA TAAAATAAAA TTATACCTAG CAACAAAAAA 720 

AAAAAAAAAA AA 732 

SEQ ID NO:2:

Gly Leu Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile
1               5                   10                  15

Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val
            20                  25                  30

Ile Gly Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile
        35                  40                  45

Pro Ser Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn
    50                  55                  60

Ser Ser Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met
65                  70                  75                  80

Met Leu Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu
                85                  90                  95

Pro Ser Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile
            100                 105                 110

Leu Gly Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile
        115                 120                 125

Leu Ala Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu
    130                 135                 140

Ala Val Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg
145                 150                 155                 160

Glu Thr Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
                165                 170

SEQ ID NO:3:

CCCAGTCTCT GGGCCCGGTT GTCTGTTGGG GTCACTGAAC CCCGAGCATG CCTGACGTCT 60 
GGGACCCCGG GTCCCCGGGC ACAACTGACT GCGGTGACCC CAGATACCAG GACCCGGGAG 120 

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GCCTCAGAGA ACTCTGGAAC CCGTTCGCGC GCGTGGCTGG OGGTGGCGCT GGGCGCTGGG 180 

GGCGCAGTGC TGTTGTTGTT GTGGGGCGGQ GGTCGGGGTC CTCCGGCCGT CCTCGCCGCC 240 

GTCCCTAGCC CGCCGCCCGC TTCTCCCCGG AGTCAGTACA ACTTCATCGC AGATCTGGTG 300 

GAGAAGACAG CACCTGCCGT GGTCTATATC GAGATCCTGG ACCGGCACOC TTTCTTGGGC 360 

CGCGAQGXCC CTATCTCCAA CGGCTCAGCA TTCCTGCTGC CTGCCGATCG GCTCATTGTC 420 

ACCAAOGCCC ATGTGGTGGC TGATCGGCGC AGAGTCCGTG TGAGACTGCT AAGCGGCGAC 480 

ACGTATGAGG CCGTGGTCAC AGCTGTGGAT CCCGTGGCAG ACATCGCAAC GCTGAGGATT 540 

CAGACTAAGG AGCCTCTCCC CAOGCTGCCT CTGGGACGCT CAGCTGATGT CCCGCAAGGG 600 

GAGTTTGTTG TTGCCATGGG AAGTCCCTTT GGACTGGAGA ACAGGATCAC ATCCGGCATT 660 

GTTAGCTCTG CTCAGCGTCC AGCCAGAGAC CTGGGACTCC CCCAAACCAA TCTGCAATAC 720 

ATTCAAACTG AT GCAGCT AT TGATTTTGGA AACTCTGGAG GTCCCCT6GT TAACCTGGTG 780 

AGTCAGACAT CCTTCCTTCC AAGAATCCCT GCCCCAGGTC AGTGTGGGAA GGGTAGGTTT 840 

CCCCXAATTC AAGGATGTTT GGTCAAGTTT CTGAGCAGTT CTTTGTTGGC TATCTCTCAA 900 

TATCCAACCA GATCTCCCCA ACACTTGCTG GTACTTTTGT TCGGGTGCCC CCATCCCCTA 960 

CTATTTGTTT AGGCTAGGGA ACTGGGGGCT GTATCCCTGC AGGATGGCGA CGTGATTGGA 1020 

GTGAACACCA TGAAGGTCAC AGCTGGAATC TCCTTTGCCA TCCCTTCTGA TCGTCTTCGA 1080 

GAGTTTCTGC ATCGTGGGGA, AAAGAAGAAT TCCTCCTCCC GAATCAGTGG CTCCCAGCGG 1140 

CGCTACATTG GGGTGAXGAT GCXGACCCTG AGTCCCAGCA TCCTTGCTGA ACTACAGCTT 1200 

CGAGAACCAA GCTTTCCCGA TGTTCAGCAT GGTGTACPCA TCCATAAAGT CATCCTGGGC 1260 

TCCCCTGCAC ACCGGGCTGG TCTfcCGGCCT GGTGATGTGA TTTTGGCCAT TGGGGAGCAG 1320 

ATGGTACAAA AXGCTGAAGA TGTTTATGAA GCTGTTCGAA CCCAATCCCA GTTGGCAGTG 1380 

CAGATOOGGC GGGGACGAGA AACACTGACC TTATATGTGA CCCCTGAGGT CACAGAATGA 1440 

ATAGATCACC AAGAGTATGA GGCTCCTGCT CTGATTTCCT CCTTGGCTTT CTGGCTGAGG 1500 

TTCTGAGGGC ACCGAGACAG AGGGTTAAAT GAACCAGTGG GGGCAGGTCC CTCCAACCAC 1560 

CAGCACTGAC TCCTGGGCXC TGAAGAATCA CAGAAACACT TTTTATATAA AATAAAATTA 1620 

TACCTAGCAA CATATTATAG TAAAAAATGA GGTGGGAGGG CTGGATCTTT TCCCCCACCA 1680 

AAAGGCTAGA GGTAAAGCTG TATCCCCCTA AACTTAGGGG AGATACTGGA GCTGACCATC 1740 

. CTCACCTCCT ATTAAAGAAA ATGAGCTGCT GAAAAAAAAA AAAAAAA 1787 

SEQ ID NO:4:

Pro Ser Leu Trp Ala Arg Leu Ser Val Gly Val Thr Glu Pro Arg Ala 
1 5 10 15 

Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg Ala Gin Leu Thr Ala Val 

20 25 30 

Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser £lu Asn Ser Gly Thr Arg 

35 40 45 

Sor Arg Ala Trp Leu Ala Vail Ala Leu Gly Ala Gly Gly Ala Val Leu 

50 55 60 

Leu Leu Leu Trp Gly GLy Gly Arg Gly Pro Pro Ala Val Leu Ala Ala 
65 70 75 80 

Val Pro Ser Pro Pro Pro Ala Ser Pro Arg Ser Gin Tyr Asn Phe He 
SS 90 ' 95 
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Ala Asp Val Val Glu Lys Thr Ala Pro Ala Val Val Tyr He Glu lid 

100 105 110 

Leu Asp Arg His pro Phe Leu Gly Arg Glu val Pro He Ser Asa Gly 

115 120 125 

Sox Gly Phe Val Val Ala Ala Asp Gly Lea lie Val Thr Asn Ala His 

130 135 140 

Val Val Ala Asp Arg Arg Arg Val Arg Val Arg Leu Leu Ser Gly Asp 
145 150 155 150 

Thr Tyr Glu Ala Val Val Thr Ala Val Asp Pro Val Ala Asp Ho Ala 

165 170 175 

Thr Leu Arg He Gin Thr Lys' Glu Pro Leu Pro Thr Lau Pro Leu Gly 

180 185 190 

Arg Ser Ala Asp Val Arg Gin Gly Glu Ph» Val Val Ala Mot Gly Ser 

155 200 205 

Pro Pho Ala Leu Gin Aon Thr He Thr Ser Gly lie Val Ser Ser Ala 

210 215 220 

Gin Arg Pro Ala Arg Asp Leu Gly Leu Pro Gin Thr Asn Val Glu Tyr 
225 230 235 2*0 

lie Gin Thr Asp Ala Ala lie Asp Phe Gly Asn Ser Gly Gly Pro Leu 

245 250 255 

Val Asn Leu Val Ser Glu Thr Ser Phe Leu Pro Arg He Pro Ala Pro 

260 265 270 

Gly Gin Cys Gly Lys Gly Arg Phe Pro Leu He Gin Gly Cys Leu Val 
275 280 285 

* Lys Phe Leu Ser Ser Ser Leu. Leu Ala He Ser Gin Tyr Pro Thr Arg 
290 295 300 

Ser Pro Gin His Leu Leu Val Leu Leu Phe Gly Cys Pro His Pro Leu 
30S 310 315 320 

Leu Phe Val 

SEQ ID NO:5:

CTTCGGGCA7 GGCCGGCTTT GGGGGGCATT CGCTGGGGGA GGAGACCCCG TTTGACCCCT 60 

GACCTCCGGG CCCTGCTGAC GTCAGGAACT TCTGACCCCC GGGCCCGAGT GACTTATGGG 120 

ACCCCCAGTC TCTGGGCCCG GTTGTCTGTT GGGGTCACTG AACCCCGAGC ATGCCTGACG 180 

TCTGGGACCC CGGGTCCCCG GGCACAACTG ACTGCGGTGA CCCCAGATAC CAGGACCCGG 240 

GAGGCCTCAG AGAACTCTGG AACCCGTTCG CGGGCGTGGC TGGCGGTGGC GCTGGGCGCr 300 

GGGGGGGCAG TGCTGTTGTT GTTGTGGGGC GGGGGTCGGG GTCCTCCGGC CGTCCTCGCC 360 

GCCGTCCCTA GCCCGCCGCC CGCTTCTCCC CGGAGrCAGT ACAACTTCAT CGCAGATGrG 4 20 

GTGCACAACA CAGCACCIGC CGTGGTCTAT ATCGAGATCC TGGACCGGCA CCCTTTCTTG 4 80 

GGCCGCGAGG 1CCCTATCTC GAACGGCTCA GGATTCGTGG TGGCTGCCGA TGGGCTCAXT 540 

GTCACCAACG CCCATGTGGT CGCTGATCGG CGCAGAGTCC GTGTGAGACT GCTAAGCGGC 600 

GACACGTATG AGGCCG7GGT .CACAGCTGTG GATCCCGTGG CAGACATCGC AACGCTGAGG 660 

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ATTCAGACTA AGGAGCCTCT CCCCACGCTG CCTCTGGGAC CCTCACCTCA TCTCCGGCAA 720 
GGGGAGTTTG TTGTTGCCAT GGGAAGTCCC T7TGCACTGC AGAACACGAT CACATCOGGC 780 
ATTGTTAGCT CTGCTCAGCC tCCAGCCAGA CACCTGGGAC TCCCCCAAAC CAATGTGGAA 840 
TACATTCAAA CTGATGCAGC TATTGATTTT CGAAACTCTG GAGGTCCCCT GGTTAACCTG 900 
GCTAGGGAAC TCGCGGCTGT ATCCCTCCAC CATGGGGAGG TGATTGGAGT GAACACCATG 960 
AAGG7CACAG CTGGAATCTC CTTTGCCATC CCTTCTGATC GTCTTCGAGA GTTTCTGCAT 1020 
CGTGCGGAAA AGAAGAATTC CTCCTCCGGA ATCAGTGGGT CCCAGCGGCG CTACATTGGG 1080 
GTGATGATGC TGACCCTGAG TCCCAGGGCT GGTCTGCGGC CTGGTGATGT GATTTTGGCC 1U0 
ATTGGGGAGC AGATGGTACA AAATGC7CAA GATGTTTATG AAGCTGTTCG AACOCAATCC 1200 
CAGTTGGCAG TGCAGATCCG GCGGGGACGA GAAACACTGA CCTTATATCT CACCCCTCAC 1260 
GTCACAGAAT GAATAGMCA CCAAGAGTAT GAGGCTCCrG CTCTGATTTC CTCCTTGCCT 1320 
TTCTGGCTGA GGTTCTGAGG GCACCGAGAC AGAGGG7TAA ATGAACCAGT GGGOGCAGCT . 1380 
CCCTCCAACC ACCAGCACTG ACTCCTGGGC TCTGAAGAAT CACAGAAACA CTTTTTATAT 1440. 
AAAAIAAAAT TATACCTAGC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1S00 

1503 

SEQ ID NO:6:

Leu Arg Ala Trp Arg Ala Leu Gly Gly Tie Arg Trp Gly Arg Arg Pro 
1 _ 5 10 15 

Arg Lau Thr Pro Asp Lou Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp 

20 25 30. 

Pro Arg Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Lau 

35 40 45 

Ser Val Gly Val Thr Glu Pro Arg Ala Cys Lau Thr Sar Gly Thr Pro 

50 55 eo 

Gly Fro Arg Ala Gin Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg 

70 75 80 

Glu Ala. Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala. Val 

65 90 95 

Ala Leu Gly Ala Gly Gly Ala Val Lau Leu Lau Leu Trp" Gly Gly GLy 

100 105 110 

Arg Gly Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala 

115 120 125 

Sex Pro Arg Ser Gin Tyr Aan Phe lie Ala Aap Val Val Glu Lys Thr 

130 13S 140 

Ala Pro Ala Val Val Tyr lie Glu He Lev Asp Arg His Pro Phe Lau 
145 150 155 160 

Gly Arg Glu Val Pro He Ser Asn Gly Ser Gly ehe Val Val Ala Ala 

165 170 175 

Asp Gly Leu He Val Thr Aan Ala His Val Val Ala Asp Arg Arg Arg 

180 185 X90 

Val Arg Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr 
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195 200 205 

Ala Val Asp Pro Val Ala Asp lie Ala Thr Leu Arg He Gin Thr Lys 

210 215 220 

Glu Pro Leu Pro Thr Leu Pro Leu Cly Arg Ser Ala Asp Val Arg Gin 
225 230 235 240 

Gly Glu Ptia Val Val Ala Met Gly Ser Pro Phe Ala Leu Gin Asn Thr 

245 250 255 

He Thr Ser Gly He Val Ser Ser Ala Gin Arg Pro Ala Arg Asp Leu 

260 265 270 

Gly Leu Pro Gin Thr Asn Val Clu Tyr II* Gin Thr Asp Ala Ala He 

275 280 285 

Asp Phe Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Ala Arg Glu Leu 

290 295 300 

Gly Ala Val Ser Leu Gin Asp Gly Glu Val He Gly' Val Asn Thr Met 
305 310 315 320 

Lya val Thr Ala Gly lie Ser Phe Ala lie Pro Ser Asp Arg Leu Arg 

325 330 335 

Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser Gly lie Ser 

34 0 345 350 

Gly Ser Gin Arg Arg Tyr He Gly Val Mat Met Leu Thr Lou Ser Pro 

353 360 365 

Arg Ala Gly Leu Arg Pro Gly' Asp Val He Leu Ala Ha Gly Glu Gin 

- 370 375 380 

Met Val Gin Asn Ala Glu Asp Val Tyr Glu Ala Val Arg Thr Gin Ser 
385 390 395 400 

Gin Leu Ala Val ain He Arg Arg Gly Arg Glu Thr Leu Thr Leu Tyr 

405 410 415 

Val Thr Pro Glu Val Thr Glu 
420 

SEQ ID NO:7:

CGCCGGAAGG GCTAGCGGTC CCAGCATACC CCGCGGCCCC TTGGGCCGTC TCACAACTCG * 60 
CGTCCGCCGG AGACCACAAT TCCCGGCATT CGTGGGGCAT GGAGGAGTCG GCCTCCCGGA 120 
ATCCTGGTCC CGCCGTGCAC TTCTGAAGGA CTTCAGGTAC CGGCGTGCCC CGCGTCCTAC 1B0 
TGTCCGCCTG CTCGCGTCCT CGGTGCCGCC TCTGAG*AGG GCGGGCG AGG AGG CAGCCAA 240 
CGCGGAGCTG ATG GCT GCG CCG AGG GCG GGG CGG GGT CCA GGC TGG AGC 289 
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser 
1 5 * 10 

CTT CGG GCA TGG CGG GCT TTG GGG GGC ATT TGC TGG GGG AGG AGA CCC 337 
Leu Arg Ala Trp Arg Ala Leu Gly Gly He Cya Trp Gly Arg Arg Pro 
15 20 25 
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CGT TTG ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC 3 85 

Arg Leu Thr Pro Asp Leu Arg Ala Leu Leu 7hr Ser Gly Thr Ser Asp 
30 35 40 45 

CCC CGG GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG 433 
Pro Arg Ala Arg Val Thr Tyr Gly Thr Pro Sar Lau Txp Ala Arg Lau 
30 33 60 



TCT GTT GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCC 481 
Ser Val Gly Val Thr Glu Pro Arg Ala Cya Leu Thr Ser Gly Thr Pro 
65 70 75 

CGT CCC CGG CCA GAA CTG ACT GCC CTC ACC CCA CAT ACC ACC ACC CGG 529 

Gly : Pro Arg Ala Gin Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg 
80 85 90 

GAG GCC TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG 577 
^Glu Ala Sar Glu Asn Sar Gly Thr Arg Sar Arg Ala Trp Leu Ala Val 
95 100 105 



GCG CTG CGC GCT CCC CGG CCA CTC CTG TTG TTC TTC TCC CCC GCC CCT €25 

Ala Leu Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Giy Gly 
110 115 120 . 125 

CGG GGT CCT CCG GCC GTC. CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT 673 
Arg Gly Pro Pro Ala Val Lau Ala Ala Val Pro Sar Pro Pro Pro Ala 
130 135 140 

TCT CCC CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA 721 
Sor Pro Arg Ser Gin Tyr Asn Phe He Ala Asp Val Val Glu Lys Thr 
145 150 155 

GCA CCT GCC GTG GTC TAT ATC GAG ATC CTG CAC CGC CAC CCT TTC TTG 769 
Ala Pro Ala Val Val Tyr lie Glu He Leu Asp Arg His Pro Phe Leu 
160 . 165 "* 170 

GGC CGC GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC B17 
Gly Arg Glu Val Pro lie Sar Asn Gly Sar Gly Phe Val Val Ala Ala 
175 180 • 185 

GAT GGG CTC ATT GTC ACC AAC GCC CAT GTG CTG GCT GAT CGG CGC AGA 865 
Asp Gly Leu He Val Thr Asn Ala His Val Yal Ala Asp Arg jpxg"~Arg~ 
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«° 195 200 205 

GTC CGT GTG AGA CTG CTA AGC GGC GAC ACC TAT GAG GCC GTG GTC ACA 913 
Val Arg Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala VaL Val Thr 
21 ° 215 220 

GCT GTG GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG 951 
Ala Val A»p Pro Val Ala Asp lie Ala Thr Leu Arg lie Gin Thr- Lys 
225 230 23S 



GAG CCT CTC CCC ACG CTG CCT CTG GGA CGC ?CA GCT GAT GTC CGG CAA 
Glu Pro Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gin 
240 . 245 250 



1009 



GGG GAG TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTC CAC AAC ACG 1057 
Gly Glu Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gin ASn Thr 
255 _ 260 265 

ATC ACA TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG 1105 
lie Tar Ser Gly He Val Ser Ser Ala Gin Arg Pro Ala Arg Asp Leu 
270 275 280 285 

GGA CTC CCC CAA ACC AAT CTC CAA TAC ATT CAA ACT CAT CCA GCT ATT 1153 
Gly Leu Pro Gin Thr Asn Val Glu. Tyr lie Cln Ttir Asp Ala Ala He 
290 295 300 

GAT TTT GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG 1201 
Asp Phe Gly Asn Sar Gly Gly Pro Lqu Val Asn Leu Asp Gly Glu Val 
305 310 315 

ATT GGA GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC 124 9 
lie Gly Val Asn Thr Met Lys Val Thr Ala Gly He ser Phe Ala He 
320 325 ^ " ' 330 

CCT TCT GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT 1297 
Pro Ser Asp Arg L«u Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn 
335 . 340 -345 

TCC TCC TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG 134 5 
Ser Ser Ser Gly He Ser Gly Ser Gin Arg" Arg Tyr He Gly Val Mot 
350 355 360 365 
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ATG CTG ACC CTG ACT CCC AGC ATC CTT GCT CAA CTA CAG CTT CGA GAA 1393 
Met Lau Thr Leu Sar Pro Sar Ila Lau Ala Glu Leu Gin Leu Arg Glu 
370 375 380 

CCA AGC TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC 1441 
Pro Ser Phe Pro Asp Val Gin Hia Gly Val Lau tla KLs Lya Val He 
385 390 395 

CTG GGC TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT .GGT GAT GTG ATT 1489 
Lau Gly Sar Pro Ala His Arg Ala Gly Leu Arg Pro GLy Aap Val rie 
400 405 4L0 

TTG GCC ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA 1537 
Lau -Ala Ila Gly Glu Gin Met Val Gin Aan Ala Glu Aap Val Tyr Glu 
U5 420 425 

GCT GTT CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GCA* CGA 1585 
Ala Val Arg Thr Gin Ser Gin Leu Ala Val Gin He Arg Arg Gly Arg 
430 435 440 445 

GAA ACA CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACC 1637 
Glu Thr Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu 
450 455 

AAGAGTATGA GCCTCCTGCT CTGATTTCCT CCTTCCCTTT CTGGCTGAGG TTCTGAGGGC 1697 

ACC6AGACAC AGGGTTAAAT GAACCAGTGG GGGCAGGTCC CTCCAACCAC CAGCACTGAC 1757 

TCCTGGGCTC TGAAGAATCA CAGAAACACT TTTTATATAA AATAAAATTA TACCTAGCAA 1817 

CATAAAAAAA AAAAAAAA 1835 

SEQ ID NO:8:

Mat Ala Ala Pro Arg Ala Gly Arg Cly Ala Gly Trp Sar Leu Arg Ala 
•1 * 3 10 • 15 

Trp Arg Ala Lau Gly Cly Ila Cya Trp Gly Arg Arg Pro Arg Leu Thr 

20 25 3d 

Pro Asp Lau Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala 

35 40 45 

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly 

50 55 60 

V«l Thr Glu Pro Arg Ala Cya Leu Thr Ser Gly Thr Pro Gly Pro Arg 
63 70 75 80 

Ala Gin Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser 
85 90 95 
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Glu Asn Sar Gly The Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly 

100 105 110 

Ala Gly Gly Ala Val Leu Lau Leu Lau Trp Gly Gly Gly Arg Gly Pro 

115 120 125 

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Sar Pro Arg 

130 135 140 

Ser Gin Tyr Asn Phe lie Ala Ajp Val Val Glu Lys The Ala Pro Ala 
145 150 155 160 

Val Val Tyr lie Glu He Lou Asp Arg His Pro Phe Leu Gly Arg Glu 

165 170 175 

Val Pro lie Sor Asn Gly Sar Gly Phe Val Val Ala Ala Asp Gly Leu 

160 185 190 

lie Val Vhr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val 

195 200 .205 

Arg Lau Leu Sor Gly Asp Thr Tyr Glu Ala Val Val Thr Ala. Val Asp 

210 215 220 

Pro Val Ala Asp lie Ala Thr Leu Arg He Gin Thr Lys Glu Pro Lou 
225 * 230 235 240 

•^PCO Thr Leu Pro Leu Gly Arg Sor Ala Asp Val Arg Gin Gly Glu Pha 
245 250 255 

Val Val Ala Met Gly Ser Pro Pho Ala Leu Gin Asa Thr 21a Thr Ser 

260 265 270 

Gly lie Val Ser Sor Ala Gin Arg Pro Ala Arg Asp Leu Gly Leu Pro 

275 280 285 

Gin Thr Asa Val Glu 7yr He Gin Thr Asp Ala Ala lie Asp Phe Gly 

290 295 300 

Asn Sor Gly Gly Pro Leu Val Ash Leu Asp Gly Glu Val Ho Gly Val 
305 310 315 320 

Asn Thr Met Lys Val Thr Ala Gly He Ser Phe Ala He Pro Ser Asp 

325 330 • 335 

Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser 

340 345 350 

Gly He Ser Gly Ser Gin Arg Arg Tyr He Gly Val Met Met Leu Thr 

355 360 365 

Leu Ser Pro Ser He Leu Ala Glu Leu Gin' 1 ' Leu Arg Glu Pro Ser Phe 

370 375 330 

Pro Asp Val Gin His Gly Val Lou He His Lys Val He Leu Gly Sor 
385 390 395 400 

Pro Ala His Arg Ala Gly Lou Arg Pro Gly Asp Val He Lou Ala Ilo 

405 410 415 

Gly Glu Gin Met Val Gin Asn Ala Glu As? Val Tyr Glu Ala Val Arg 
420 42S 430 . 
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7hr Gin Scr Gin Leu Ala Val Gin lie Arg Arg Gly Arg Glu 7fcr leu 

<35 440 445 

Thr Leu Tyr Val Thr Pro Glu Val Thr Glu 
450 453 

SEQ ID NO:9:

TGGGACAGGC AGCTCCGGGG TCCGCGGTTT CACATCGGAA ACAAAACAGC CGCTGG7CXG 60 

GAAGGAACCT GAGCTACGAG CCGCGGCGGC AGCGGGGCGG CGGGGAAGCG TATACCTAA7 120 

CTGGCAGCCT GCAAGTGACA ACAGCCTTTG CGGTCCrTAG ACAGCT7GGC CTGOAGGAGA 180 

ACACAXGAAA GAAAGAACCT CAAGAGGCTT TGTTXrCTGT CAAACACTAT TTC7A7ACAG 240 

TTGCXCCAAT GACAGAGX1A CCTGCACCGT TGTCCTACTT CCAGAATGCA CAGATGTCTG 300 

AGGACAACCA CCTGACCAAT ACTGTACGTA CCCAGAATGA CAATAGAGAA CGGCAGGAGC 360 

ACAACGACAG ACGGAGCCXX GGCCACCCTG AGCCATTATC TAATGGACGA CCCCAGGGTA 420 

ACXGCCGGCA GGTGGTGGAG CAAGATGAGG AAGAAGATGA GGAGCTGACA TTGAAATATG 480 

GCGCCAAGCA TGTGATCATG CTCrrrGTCC CTGTGACTCT C7GCATGGXG GTGGTCGTGG 540 

CTACCATTAA GXCAGXCAGC TTXrA?ACCC GGAAGGATGG GCAGC7AATC XAXACCCCAT 600 

TCACAGAAGA TACCGAGACT GTGGGCCAGA GAGCCCTGCA CTCAA77CXG AATGCTGCCA 660 

TCATGATCAG XGXCAXXGXX GTCATGACTA TCCTCCTGGT GG7XC7GTAT AAATACAGGT 720 

GCXATAAGGT CATCCAXGCC TGGCrrATTA TATCATCTCT A7TGT7CCXG TTCTTTTTTT 780 

CATTCATTTA CTTGGGGGAA GTGrTTAAAA CCTATAACGT TGCTG7GGAC TACATTCACTG 840 

TTGCACXCCT GATCTGGAAr TTTGGTGTGG TGGGAATGAT TTCCA77CAC TGGAAAGGTC. 900 

CACXTCGACT CCAGCAGGCA TAXCTCATTA TGATTAGTGC CCTCATGGCC CTGGTGTTTA S60 

TCAAGTACCT CCCTGAATGG ACTGCGTGGC TCATCTTGGC TGTGA7TTCA GTATATGATT 1020 

TAGTGGCTGT TTTGTGTCCG AAAGGTCCAC TTCGTATGCT GGTTGAAACA GCXCAGGAGA 1080 

GAAATGAAAC GCTTTTTCCA GCTCTCATTT ACTCCTCAAC AATGG7GTGG TTGGTGAATA ' 1140 

XGGCAGAAGG AGACCCGGAA GCTCAAAGGA GAGTATCCAA AAATTCCAAG TATAATGCAG 1200 

AAAGCACAGA AAGGGAGTCA CAAGACACXG .TTGCAGAGAA TGATGATGGC GGGTTCAGXG 1260 

' AGGAATGGGA AGCOCAGAGG GACAGTCATC TAGGGCCTCA TCGC7C7AQA CCTGAGTCAC 1320 

GAGCTGCTGT CCAGGAACTT TCCAGCAGTA TCCTCGCTGG TGAAGACCCA GAGGAAAGGC 1380 

GAGTAAAACT TGGAXXGGGA GATTTCArTT TCTACAGTGT TCTGGTTGGT AAAGCCTCAG 1440 

CAACAGOCAG TGCAGACTCG AACACAACCA TAGCCTCTTT .CCTACCCAXA TTAATTCGTT 1500 

TGTGCCTTAC ATTAXXACXC CTTGCCATTT TCAAGAAAGC ATTGCCAGCT CTXCCAATCT 1560 

CCAXCACCTT TGCGCTrCXX TTCXACXTTC CCACACATTA TCTTCTACAC CCTTTTAXGG 1620 

ACCAAXXAGC AXXCCAXCAA TTXrATAICT AGCATATTTG CGGT7AGAAX CCCATGGATG 1680 

TXTCTTCXTT GACXAX AAC C AAAfCTGGGG AGGACAAAGG TGAT777CCX GTGTCCACAX 1740 

CXAACAAAG7 CAAGAIXCCC GGCTGGACTT TTGCAGCTTC CTTCCAAGTC XTCCTGACCA 1800 

.CCXTGCACTA XTGGACXXXG CAAGGAGGXG CCTAXAGAAA ACGAT77XGA ACATACXXCA .1860 

TCGCAGTGGA CTGXCTCCCT CCGTGCAGAA ACTACCAGAT 77CACCCACC AGGTCAAGGA 1920 

GAXATGAXAG GCCCGGAAGI TGCrGTGCCC CATCAGCAGC XTGACGCGTG GTCACAGGAC 1980 

GAXTTCACTG ACACXGCGAA CTCTCAGGAC TACCGGTTAC CAAGAGGTXA GGXGAAGXGG 2040 

XXXAAAOCAA ACGGAACTCX TCArCTTAAA CTACACGT7G AAAA7CAACC CAA7AAXTCT 2100 

GTAXTAACTG AAXTCTGftAC TTTTCAGGAG GTACTGTGAG GAAGAGCAGG CACCAGCAGC 2160

AGAATGCGGA ATGGAGAGGT GGGCAGGGGT TCCAGCTTCC CTTTGATTTT TTGCTGCAGA 2220

CTCATCCTTT TTAAATGAGA CTVGVTTTCC CCTCTCTTTG AGTCAAGTCA AATA7GTAGA 2280

TTGCCTTTGG CAATTCTTCT TCTCAAGCAC TGACACTCAT TACCGTCTGT GATTGCCATT 2340

TCTTCCCAAG GCCAGTCTGA ACCTGACCTT CCTTTATCCT AAAAGTTTTA ACCTCAGGTT 2400

CCAAATTCAG TAAATTTTGG AAACAGTACA GCTATTTCTC ATCAATTCTC TA7CA7GTTC 2460

AAGTCAAATT TGGAXTTTCC ACCAAATTCT GAATTTGTAG ACATACTTGT ACOCTCACTT 2520

GCCCCCAGAT GCCTCCTCTG TCCrCATTCT TCTCTCCCAC ACAAGCAGTC T7T7TCTACA 2580

GCCAGTAAGG CAGCTCTGTC RrGGTAGCAG ATGGTCCCAT TATTCTAGGG 7CTTACTCTT 2640

TGTATGATGA AAAGAATGTG TTATGAATCG GTGCTGTCAG CCCTGCTGTC AGACCTTCTT 2700

CCACAGCAAA TGAGATGTAT GCCCAAAGCG GTAGAATTAA AGAAGAGTAA AA7GGCTGT7 2760

GAAG 2764



SEQ ID NO:10:

Met Thr Glu Leu Pro Ala Pro Lou Ser Tyr Phe Gin Asn Ala Gin Met 
1 5 10 IS 

Ser Glu Asp Asn His Leu Ser Asn Thr Val Arg Sar Gin Asn Asp Asn 

20 25 30 

Arg Glu Arg Gin G]Lu Ris Asn Asp .Arg Arg Ser Leu Gly His Pro Glu 

35 40 45 

Pro Leu Ser Asn Gly Arg Pro Gin Gly Aan Ser Arg Gin Val Val Glu 

50 55 60 

Gin Asp Glu Glu Glu Asp Glu Glu Leu Thr Leu Lys Tyr Gly Ala Lys 
65 70 75 • 80 

His Val He Met Leu Pha Val Pro Val Thr Leu Cys Mat Val Val Val 

85 90 .93 

Val Ala Thr Ha Lys' Ser Val Sax Ph* Tyr Thr Arg Lys Asp Gly Gin 

100 . 105 110 

Leu He Tyr Thr Pro Pha Thr Glu Asp Thr Glu Thr Val Gly Gin Arg 

115 120 125 

Ala Leu His Sac Cla Lau Asa Ala Ala Ha Mat Xla Sar Val Ho Val 

130 135 140 

Val Met Thr He Leu Lea Val Val Leu Tyr Lys Tyr Arg Cys Tyr Lys 
145 150 155 160 

Val Ha His Ala Trp Leu He He Ser Sen- Leu Leu Leu Leu Phe Phe 

165 170 175 

Phe Ser Phe He Tyr Leu Gly Glu Val Pha Lys Thr Tyr Asn Val Ala 

180 185 190 

Vftl Asp Tyr He Thr Val Ala Leu Leu He Trp Asn Phe Gly Val Val 

195 200 205 

Gly Met He Sec lie His Trp Lys Gly Pro Leu Arg Leu Gin Gin Ala 
210 215 * 220 
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Tyr Leu He Met He Ser Ala leu Met Ala Leu Val Pha He Lys Tyr 
225 230 235 24 0 

Leu Pro Glu Trp Thr Ala Trp Leu He Leu AU Val He Ser Val Tyr 

245 250 255 

Asp Lea Val Ala Val Leu Cys Pro Lys Gly Pro Leu Arg Met Leu Val 

260 265 270 

Glu Thr Ala Gin Glu Arg Aan Clu Thr Leu Phe Pro Ala Leu He Tyr 

275 280 285 

Ser Ser Thr Net Val Trp Leu Val Aan Met Ala Glu Gly Asp Pro Glu 

230 295 300 

Ala CLn Arg Arg Val Ser Lys Asn Ser Lys Tyr Ajto Ala Glu Ser Thr 
303 310 . 315 320 

Giu Arg Clu Ser Gin Asp Thr Vol Ala Glu Asn Asp Aap Gly Gly Phe 

325 330 335 

Sec Clu Clu Trp Glu Ala Gla Arg Aap Ser Hia Leu Gly Pro Els Arg 

340 345 350 

Ser Thr Pro Glu Ser Arg Ala Ala Val Cla Clu Leu Ser Ser Ser He 

335 360 365 

Leu Ala Gly Glu Asp Pro Glu Clu Arg Cly Val Lys Leu Gly Leu Gly 

370 373 380 

Aap Phe He Phe Tyr Ser Val Leu Val Gly Lys Ala Ser Ala Thr Ala 
385 390 * 395 400 

Sec Gly Aap Trp Asa Thr Thr. He Ala Cye Phe Val Ala He Leu He 

405 410 415 

Gly Leu Cye Leu Thr Leu Leu Lou Leu Ala II a Phe Ly« Lys Ala Leu 

420 425 430 

Pro Ala Leu Pro lie Ser He Thr Phe Gly ten Val Phe Tyr Phe Ala 

435 440 445 

thr Asp Tyr Leu Val Gin Pro Phe Met Asp Gin Leu Ala Phe Hie Gin 

450 455 460 

Phe Tyr He 
465 

SEQ ID NO:11:

CGGAATTCCG TATGCTGGTT GAAACA 26

SEQ ID NO:12:

CGGGATCCTC AGGCTACGAA ACACCCTAT 29

SEQ ID NO:13:

TATATCAGCG GTATGACCGA CCTCTATGCG TGGGATGAAT ACCGACGTCT GATGGCCGTA 60 
GAACAATAAC CAGGCTTTTG TAAASACGAA CAATAAATTT TTACCTTTTG CACAAACTTT 120 
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AGTTCGGAAC TTCAGGCTAT AAAACGAATC TGAAGAACAC AGCAATTTTG CGTTATCfGT 180 
TAATCGAGAC TGAAATACA7 GAAAAAAACC ACAVTAGCAC TGAGTCGACT GGCTCTGAGT 24 0 
TTAGGTTTGG CGTTATCTCC GCTCTCTGCA ACGGCGGCTG AGACTTCTTC AGCAACGACA 300 
GCCCAGCAGA TGCCAAGCC7 TGCACCGATG CTCGAAAAGG TGATGCCTTC AGTGGTCAGC 360 
ATTAACGTAG AAGGIAGCAC AACCGTTAAT . ACGCCGCGTA TGCCGCGTAA TTTCCAGCAC 4 20 
TTCTTOGGTG ATGATTCTCC GTTCTGCCAG GAAGGTTCTC CGTTCCAGAG CTCTCCGTTC 4i0 
TCCCAGGGTG GCCAGGGCGG TAATGGTGGC GGCCAGCAAC AGAAATTCAT GGCGC7GGGT 540 
TCCGGOGTCA TCATTGATCC CGATAAAGGC TATGTCGTCA CCAACAACCA CGTTGTTGAT 600 
AACGCGACGG TCATTAAAGT TCAACTGAGC GATGGCCGTA AGTTCGACGC GAAGATGGTT 660 
GGCAAAGATC CGCGCTC7GA TATCGCGCTG ATCCAAATCC AGAACCCGAA AAACCIGACC 720 
CCAATTAAGA TGGCGGA77C TGATGCACTG CGCGTGGGTG ATTACACCGT AGGGAtXGGT 780 
AACCCGTTTG GTCTGGGCGA GACGGTAACT TCCGGGATTG .TCTCTGCGCT GGGGCGTAGC 8 40 
GGCCTGAATG CCGAAAAC7A CGAAAACTTC ATCCAGACCG ATGCAGCGAT CAACCGTCGT 900 
AACTCCGGTG GTGCGCTGGT TAACCTGAAC GGCGAACTGA TCGGTATCAA CACCGCGATC 960 

CTCGCACCGG ACGGCGGCAA CATCGGTATC GGTTTTCCTA TCCC6AGTAA CATGGT G AAA 1020 

AACCTGACCT CGCAGATGGT. GGAATACGGC CAGGTCAAAC CCGGTGAGCT GGGTATTATG 1080 

GGGACTGAGC TGAACTCCGA ACTGGCGAAA GCGA7GAAAG TTGACGCCCA GCGCGGTGCT 1140 

TTCG7AAGCC AGGTTCTGCC TAATTCCTCC GCTGCAAAAG CGGGCATTAA AGCGGGTGAT 1200 

GTGATCACCT CACTGAACGG TAAGCCGATC AGCAGCTTTG CCGGACTGCG TGCTCAGGTG 12 60 

GGTACTAXGC CGGTAGGCAG CAAACTGACC CTGGGCTTAC TGCGCGACGG TAAGCAGGTT 1320 

AACGTGAAJCC TGGAACTCCA GCAGAGCACC CAGAA7CAGG TTGATTCCAG CTCCATCTTC 1380 

AACGGCATTG AAGGCGC7GA GATGAGCAAC AAAGGCAAAG ATCAGGGCGr GGTAGTGAAC 1440 

AACGTGAAAA CCCCCACTCC CCCTGCGCAG ATCGGCCTGA AGAAAGGTGA TGTGATTATT 1500 

GGCGCGAACC AGCAGGCAGT GAAAAACATC GCTGAACTGC GTAAAGTTCT CGACAGCAAA 1560 

CCGTCTGTGC TGGCACTCAA CATTCAGCGC GGCGACCGCC ATCTAGCTGT TAA7GCAGTA 1620 

ATCTCCCTCA ACCCCTTCCT GAAAACQGGA AGGGGTTCTC CTTACAATCT GTGAACTTCA 1680 

CCACAACTCC ATACATCTTC ATCATCCTTT ■ AGGCATTTGC ACAATGCCGr ACGTTACGTA 1740 

CTTCCTTArG CTAAGCCG1G CATAACGGAG GACTTATGGC TGGCTGGCAX CTTGATACCA 180O 

AAATGGCGCA GGATATCCTG GCACGTACCA TGCGCATCAT CGATACCAAT ATCA 1854 



SEQ ID NO:14:

Met Lys Lys Thr Thr Leu Ala Leu Ser Arg Leu Ala Leu Ser Leu Gly
1               5                   10                  15

Leu Ala Leu Ser Pro Leu Ser Ala Thr Ala Ala Glu Thr Ser Ser Ala
            20                  25                  30

Thr Thr Ala Gln Gln Met Pro Ser Leu Ala Pro Met Leu Glu Lys Val
        35                  40                  45

Met Pro Ser Val Val Ser Ile Asn Val Glu Gly Ser Thr Thr Val Asn
    50                  55                  60

Thr Pro Arg Met Pro Arg Asn Phe Gln Gln Phe Phe Gly Asp Asp Ser
65                  70                  75                  80

Pro Phe Cys Gln Glu Gly Ser Pro Phe Gln Ser Ser Pro Phe Cys Gln
                85                  90                  95

Gly Gly Gin Gly Gly Asn Gly Gly Gly Gin Gin Gin Lys Phe Met Ala 

100 105 no 

Leu Gly Ser Gly Val lie Ila Asp Ala Asp Lys Gly Tyr Val' Val Thr 

115 120 125 

Asn Asn His Val Val Asp Asn Ala The Val lla Lys Val Gin Leu Ser 

130 135 140 

Asp Gly Arg Lya PKe Asp Ala Lya Mot Val Gly Lya Asp Pro Arg Ser 
1< 5 150 155 iso 

Asp Ila Ala Leu lie Gin Ila Gin Asn Pro Lys Asn Leu Thr Ala lie 

165 170 175 

Lya Met Ala Asp Ser Asp Ala Leu Arg Val Gly Asp ryr Thr Val GLy 

160 185 190 

lie Gly Asn Pro Phe Gly Leu Gly Glu Thr Val Thr Ser Gly lie Val 

195 200 205 

Ser Ala Leu Gly Arg Ser GLy Leu Asn Ala Glu Asn Tyr Glu Asn Phe 

210 , 215 220 

lie Gin Thr Asp Ala Ala tie Asn Arg Gly Asn Ser Gly Gly Ala Leu 
225 230 ~ ' 233 240 

Val Asn Leu Asn Gly Glu Leu He Gly lie Asn Thr Ala He Leu Ala 

245 * 250 255 

Pro Asp Gly Gly Asn He Gly He Gly Phe Ala He Pro Ser Asn Met 

260 265 270 

Val Lys Asn Leu Thr Ser Gin Met Val Glu Tyr Gly Gin Yal Lys Arg 

275 280 " 285 

Gly Glu Leu Gly lie Met Gly Thx Glu Leu. Asn Ser Glu Leu Ala Lys 

290 2$5 300 

Ala Met Lys Val Asp Ala Gin Arg Gly Ala Phe Val Ser Gin Val Leu 
305 310 315' 320 

Pro Asn Ser Ser Ala Ala Lys Ala Gly He Lys Ala Gly Asp Val He 

325 330 - 335 

Thr Ser Leu Asn Gly Lys Pro He Ser Ser Phe Ala Ala Leu Arg Ala 

340 ' 345 350 

Gin Val Ciy Thr Met Pro Val Gly Ser Lys Leu Thr Leu Gly Leu Leu 

355 360 > 365 

Arg Asp Gly lys Gin Val Asn Val Aan Leu Glu Leu Gin Gin Ser Ser 

370 375 380 

Gin Asn Gin Val Asp Ser Ser Ser He Phe Asn Gly He Glu Gly Ala 
385 390 395 400 

Glu Met Ser Asn Lys Gly Lys Asp Gin Gly Val Val Val Asn Asn Val 

<05 410 415 

Lys Thr Gly Thr Pro Ala Ala Gin He Gly Leu Lys Lys Gly Asp Val 
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420 

lie lie Gly Ala Asn Gin 
435 

Lys Val X«u Asp Ser Lys 
450 

Gly Asp Arg Hid Leu Fro 
465 470 
Leu Lys Thr Gly Arg Gly 
465 
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4 25 

Gin Ala Val Lys Asn lie 
440 

Fro Ser Val Leu Ala Leu 
455 460 
Val Asn Ala Vai lie Ser 
475 

Ser Pro Tyr Asn Leu 
490 



430 

Ala Glu Leu Arg 
445 

Asn lie Gin Arg 

Leu Asn pro Pne 
480 
SEQ ID NO:15:

CTGGATGGGG AGGTGATTGG AGTG 24 
SEQ ID NO:16:

GTCTCTGGGC CCCGGTTGTC TGTTG 25 
SEQ ID NO:17:

Feature: polymorphism at 1325

CCGGCCCTCG CCCTGTCCGC CGCCACCGCC GCCGCCGCCA GAGTCGCCAT GCAGATCCCG 60

CGCGCCGCTC TTCTCCCGCT GCTGCTGCTG CTGCTGGCGG CGCCCGCCTC GGCGCAGCTG 120

TCCCGGGCCG GCCGCTCGGC GCCTTTGGCC GCCGGGTGCC CAGACCGCTG CGAGCCGGCG 180

CGCTGCCCGC CGCAGCCGGA GCACTGCGAG GGCGGCCGGG CCCGGGACGC GTGCGGCTCC 240

TGCGAGGTGT GCGGCGCGCC CGAGGGCGCC GCGTGCGGCC TGCAGGAGGG CCCGTGCGGC 300

GAGGGGCTGC AGTGCGTGGT GCCCTTCGCG GTGCCAGCCT CGGCCACGGT GCGGCGGCGC 360

GCGCAGGCCG GCCTCTGTGT GTGCGCCAGC AGCGAGCCGG TGTGCGGCAG CGACGCCAAC 420

ACCTACGCCA ACCTGTGCCA GCTGCGCGCC CCCAGCCGCC GCTCCGAGAG GCTGCACCGG 480

CCGCCGGTCA TCGTCCTGCA GCGCGGAGCC TGCGGCCAAG GGCAGGAAGA TCCCAACAGT 540

TTGCGCCATA AATATAACTT TATCGCGGAC GTGGTGGAGA AGATCGCCCC TGCCGTGGTT 600

CATATCGAAT TGTTTCGCAA GCTTCCGTTT TCTAAACGAG AGGTGCCGGT GGCTAGTGGG 660

TCTGGGTTTA TTGTGTCGGA AGATGGACTG ATCGTGACAA ATGCCGACGT GGTGACCAAC 720

AAGCACCGGG TCAAAGTTGA GCTGAAGAAC GGTGCCACTT ACGAAGCCAA AATCAAGGAT 780

GTGGATGAGA AAGCAGACAT CGCACTCATC AAAATTGACC ACCAGGGCAA GCTGCCTCTC 840

CTGCTGCTTG GCCGCTCCTC AGAGCTGCGG CCGGGAGAGT TCGTGGTCGC CATCGGAAGC 900

CCGTTTTCCC TTCAAAACAC AGTCACCACC CGGATCGTCA GCACCACCCA GCGAGGCGGC 960

AAAGAGCTGG GGCTCCGCAA CTCAGACATG GACTACATCC AGACCGACGC CATCATCAAC 1020

TATGGAAACT CGGGAGGCCC GTTAGTAAAC CTGGACGGTG AAGTGATTGG AATTAACACT 1080

TTGAAAGTGA CAGCTGGAAT CTCCTTTGCA ATCCCATCTG ATAAGATTAA AAAGTTCCTC 1140

ACGGAGTCCC ATGACCGACA GGCCAAAGGA AAAGCCATCA CCAAGAAGAA GTATATTGGT 1200

ATCCGAATGA TGTCACTCAC GTCCAGCAAA GCCAAAGAGC TGAAGGACCG GCACCGGGAC 1260

TTCCCAGACG TGATCTCAGG AGCGTATATA ATTGAAGTAA TTCCTGATAC CCCAGCAGAA 1320

GCTGKTGGTC TCAAGGAAAA CGACGTCATA ATCAGCATCA ATGGACAGTC CGTGGTCTCC 1380

GCCAATGATG TCAGCGACGT CATTAAAAGG GAAAGCACCC TGAACATGGT GGTCCGCACG 1440

GGTAATGAAG ATATCATGAT CACAGTGATT CCCGAAGAAA TTGACCCATA GGCAGAGGCA 1500

TGAGCTGGAC TTCATGTTTC CCTCAAAGAC TCTCCCGTGG ATGACGGATG AGGACTCTGG 1560

GCTGCTGGAA TAGGACACTC AAGACTTTTG ACTGCCATTT TGTTTGTTCA GTGGAGACTC 1620

CCTGGCCAAC AGAATCCTTC TTGATAGTTT GCAGGCAAAA CAAATGTAAT GTTGCAGATC 1680

CGCAGGCAGA AGCTCTGCCC TTCTGTATCC TATGTATGCA GTGTGCTTTT TCTTGCCAGC 1740

TTGGGCCATT CTTGCTTAGA CAGTCAGCAT TTGTCTCCTC CTTTAACTGA GTCATCATCT 1800

TACTCCAACT AATGCAGTCG ATACAATGCG TAGATAGAAG AAGCCCCACG GGAGCCAGGA 1860

TGGGACTGGT CGTGTTTGTG CTTTTCTCCA AGTCAGCACC CAAAGGTCAA TGCACAGAGA 1920

CCCCGGGTGG GTGAGCGCTG CCTTCTCAAA CGGCCGAAGT TGCCTCTTTT AGGAATCTCT 1980

TTGGAATTGG GAGCACCATG ACTCTGAGTT TGAGCTATTA AAGTACTTCT XACAAA 2036

SEQ ID NO:18:

Feature: 213 Gly/Val polymorphism

Met Gln Ile Pro Arg Ala Ala Leu Leu Pro Leu Leu Leu Leu Leu Leu
1 5 10 15

Ala Ala Pro Ala Ser Ala Gln Leu Ser Arg Ala Gly Arg Ser Ala Pro
20 25 30

Leu Ala Ala Gly Cys Pro Asp Arg Cys Glu Pro Ala Arg Cys Pro Pro
35 40 45

Gln Pro Glu His Cys Glu Gly Gly Arg Ala Arg Asp Ala Cys Gly Cys
50 55 60

Cys Glu Val Cys Gly Ala Pro Glu Gly Ala Ala Cys Gly Leu Gln Glu
65 70 75 80

Gly Pro Cys Gly Glu Gly Leu Gln Cys Val Val Pro Phe Gly Val Pro
85 90 95

Ala Ser Ala Thr Val Arg Arg Arg Ala Gln Ala Gly Leu Cys Val Cys
100 105 110

Ala Ser Ser Glu Pro Val Cys Gly Ser Asp Ala Asn Thr Tyr Ala Asn
115 120 125

Leu Cys Gln Leu Arg Ala Ala Ser Arg Arg Ser Glu Arg Leu His Arg
130 135 140

Pro Pro Val Ile Val Leu Gln Arg Gly Ala Cys Gly Gln Gly Gln Glu
145 150 155 160

Asp Pro Asn Ser Leu Arg His Lys Tyr Asn Phe Ile Ala Asp Val Val
165 170 175

Glu Lys Ile Ala Pro Ala Val Val His Ile Glu Leu Phe Arg Lys Leu
180 185 190

Pro Phe Ser Lys Arg Glu Val Pro Val Ala Ser Gly Ser Gly Phe Ile
195 200 205

Val Ser Glu Asp Xaa Leu Ile Val Thr Asn Ala His Val Val Thr Asn
210 215 220

Lys His Arg Val Lys Val Glu Leu Lys Asn Gly Ala Thr Tyr Glu Ala
225 230 235 240

Lys Ile Lys Asp Val Asp Glu Lys Ala Asp Ile Ala Leu Ile Lys Ile
245 250 255

Asp His Gln Gly Lys Leu Pro Val Leu Leu Leu Gly Arg Ser Ser Glu
260 265 270

Leu Arg Pro Gly Glu Phe Val Val Ala Ile Gly Ser Pro Phe Ser Leu
275 280 285

Gln Asn Thr Val Thr Thr Gly Ile Val Ser Thr Thr Gln Arg Gly Gly
290 295 300

Lys Glu Leu Gly Leu Arg Asn Ser Asp Met Asp Tyr Ile Gln Thr Asp
305 310 315 320

Ala Ile Ile Asn Tyr Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp
325 330 335

Gly Glu Val Ile Gly Ile Asn Thr Leu Lys Val Thr Ala Gly Ile Ser
340 345 350

Phe Ala Ile Pro Ser Asp Lys Ile Lys Lys Phe Leu Thr Glu Ser His
355 360 365

Asp Arg Gln Ala Lys Gly Lys Ala Ile Thr Lys Lys Lys Tyr Ile Gly
370 375 380

Ile Arg Met Met Ser Leu Thr Ser Ser Lys Ala Lys Glu Leu Lys Asp
385 390 395 400

Arg His Arg Asp Phe Pro Asp Val Ile Ser Gly Ala Tyr Ile Ile Glu
405 410 415

Val Ile Pro Asp Thr Pro Ala Glu Ala Gly Gly Leu Lys Glu Asn Asp
420 425 430

Val Ile Ile Ser Ile Asn Gly Gln Ser Val Val Ser Ala Asn Asp Val
435 440 445

Ser Asp Val Ile Lys Arg Glu Ser Thr Leu Asn Met Val Val Arg Arg
450 455 460

Gly Asn Glu Asp Ile Met Ile Thr Val Ile Pro Glu Glu Ile Asp Pro
465 470 475 480

SEQ ID NO:19:

ATGCTGAACA TCGGGAAAGC TTGGTTCTCG 30
SEQ ID NO:20:

CCAACAGACA ACCGGGCCCA GAGACT 26
SEQ ID NO:21:

TGCCTCCTCG CCCGCCCTAC TCAGA 25
SEQ ID NO:22:
CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60

GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG CGCGAGAGGC GAAGTGGTCA 120

CGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180

CGAGTCAAAC AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240

AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCCCCCCCCG CCCCACGCGC TCTTGCGAAG 300

GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360

GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420

GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480

CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540

TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCA 587

SEQ ID NO:23:

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60 

GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120 

GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180 

CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240

AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300 

GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360 

GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420 

GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480

CCCGGCCTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540

TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600 

TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg
1 5 10 15

GCA TGG CGG GCT TTG GGG GGC ATT TGC TGG GGG AGG AGA CCC CGT TTG 695
Ala Trp Arg Ala Leu Gly Gly Ile Cys Trp Gly Arg Arg Pro Arg Leu
20 25 30

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg
35 40 45

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val
50 55 60

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro
65 70 75

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala
80 85 90 95

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCC TGG CTG GCG GTG GCG CTG 935
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu
100 105 110

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly
115 120 125

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCC CCG CCC GCT TCT CCC 1031
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro
130 135 140

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro
145 150 155

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg
160 165 170 175

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly
180 185 190

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg
195 200 205

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val
210 215 220

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro
225 230 235

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu
240 245 250 255



TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr
260 265 270

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu
275 280 285

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe
290 295 300

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG ATT GGA 1559
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly
305 310 315

GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT 1607
Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser
320 325 330 335

GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC 1655
Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser
340 345 350

TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG ATG CTG 1703
Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu
355 360 365

ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA CCA AGC 1751
Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser
370 375 380

TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC CTG GGC 1799
Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly
385 390 395

TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT GGT GAT GTG ATT TTG GCC 1847
Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala
400 405 410 415

ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT 1895
Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val
420 425 430

CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA 1943
Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr
435 440 445

CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA 1996
Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

TGAGGCTCCT CCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 2056

CAGAGGGTTA AATGAACCAG TGGGGGCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 2116

CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAACATAAAA 2176

AAAAAAAAAA A 2187

SEQ ID NO:24:

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGCCAGAA CCCGACTCCC GCGTAGAGCA 60 
GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC CAACTGGTCA 120
GGCGCCGAAG GCCGAGAGCA CGCCGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180
CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240
AGCGCCCGGC CGTCGCCGCC GCCCCCATTT TCGCCCCCGG CCGCAGGGGC TCTTGGGAAG 300
GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGACG CGCGCCGGAA 360

GGGCTAGCGG TCCGAGCATA CCCCGCCCCC CCTTGGCCCG TCTCACAACT CGCGTCCGGC 420
GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480 

CCCGGCCTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCCCC 540 
TCCTCGCCTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600 

TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg
1 5 10 15

GCA TGG CGG GCT TTG GGG GGC ATT CGC TGG GGG AGG AGA CCC CGT TTG 695
Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu
20 25 30

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg
35 40 45

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val
50 55 60

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro
65 70 75

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala
80 85 90 95

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu
100 105 110

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly
115 120 125

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro
130 135 140

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro
145 150 155

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg
160 165 170 175

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly
180 185 190

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg
195 200 205

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val
210 215 220

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro
225 230 235

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu
240 245 250 255

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr
260 265 270

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu
275 280 285

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe
290 295 300

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTC GAT GGG GAG GTG ATT GGA 1559
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly
305 310 315

GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT 1607
Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser
320 325 330 335

GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC 1655
Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser
340 345 350

TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG ATG CTG 1703
Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu
355 360 365

ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA CCA AGC 1751
Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser
370 375 380

TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC CTG GGC 1799
Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly
385 390 395

TCC CCT GCA CAC CGG GCT GGT CTC CGG CCT GGT GAT GTG ATT TTG GCC 1847
Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala
400 405 410 415

ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT 1895
Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val
420 425 430

CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA 1943
Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr
435 440 445

CTC ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA 1996
Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 2056

CAGAGGGTTA AATGAACCAG TGGGGCCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 2116

CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAACATAAAA 2176

AAAAAAAAAA A 2187

SEQ ID NO:25:

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly Val
305 310 315 320

Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp
325 330 335

Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser
340 345 350

Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr
355 360 365

Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser Phe
370 375 380

Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly Ser
385 390 395 400

Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile
405 410 415

Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg
420 425 430

Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu
435 440 445

Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

SEQ ID NO:26:

CGTGGATCCC GAGAAAGAGC CGCAGGACGA GGAGCCAGAA CCCCACTGGC GCGTAGACCA 60

GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAACCCTGGG GGCGAGAGGC GAAGTGGTCA 120

GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180

CCAGTCAAAG AGCCCCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCACG 240

AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCGGG CCGCAGGGGC TCTTCCGAAG 300

GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGCACCCGA AGTCCTGAGG CGCGCCGGAA 360

GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCCTCCGGC 420

CGAGACCACA ATTCCCCGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480

CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540

TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600

TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg
1 5 10 15

GCA TGG CGG GCT TTG GGG GGC ATT CGC TGG GGG AGG AGA CCC CGT TTG 695
Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu
20 25 30

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg
35 40 45

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val
50 55 60

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro
65 70 75

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala
80 85 90 95

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu
100 105 110

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly
115 120 125

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCC CCC CCC GCT TCT CCC 1031
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro
130 135 140

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro
145 150 155

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg
160 165 170 175

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly
180 185 190

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg
195 200 205

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val
210 215 220

GAT CCC GTG GCA GAC ATC GCA ACG CTC AGG ATT CAG ACT AAG GAG CCT 1319
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro
225 230 235

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu
240 245 250 255

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr
260 265 270

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu
275 280 285

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe
290 295 300

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GTG AGT GAG ACA TCC TTC 1559
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Val Ser Glu Thr Ser Phe
305 310 315

CTT CCA AGA ATC CCT GCC CCA GGT CAG TGT GGG AAG GGT AGG TTT CCC 1607
Leu Pro Arg Ile Pro Ala Pro Gly Gln Cys Gly Lys Gly Arg Phe Pro
320 325 330 335

CTA ATT CAA GGA TGT TTG GTC AAG TTT CTG AGC AGT TCT TTG TTG GCT 1655
Leu Ile Gln Gly Cys Leu Val Lys Phe Leu Ser Ser Ser Leu Leu Ala
340 345 350

ATC TCT CAA TAT CCA ACC AGA TCT CCC CAA CAC TTG CTG GTA CTT TTG 1703
Ile Ser Gln Tyr Pro Thr Arg Ser Pro Gln His Leu Leu Val Leu Leu
355 360 365

TTC GGG TGC CCC CAT CCC CTA CTA TTT GTT TAGCCTAGGG AACTGGGGGC TGTA 1757
Phe Gly Cys Pro His Pro Leu Leu Phe Val
370 375

TCCCTGCAGG ATGGGGAGGT GATTGGAGTG AACACCATGA AGGTCACAGC TGGAATCTCC 1817

TTTGCCATCC CTTCTGATCG TCTTCCAGAC TTTCTCCATC CTGCGCAAAA GAAGAATTCC 1877

TCCTCCGGAA TCAGTGGGTC CCAGCGGCGC TACATTGGGG TGATGATGCT GACCCTGAGT 1937

CCCAGCATCC TTGCTGAACT ACAGCTTCGA GAACCAAGCT TTCCCGATGT TCAGCATGGT 1997

GTACTCATCC ATAAAGTCAT CCTGGGCTCC CCTGCACACC GGGCTGGTCT GCGGCCTGGT 2057

GATGTGATTT TCGCCATTGG GGAGCAGATG GTACAAAATG CTGAAGATGT TTATGAAGCT 2117

GTTCGAACCC AATCCCAGTT GGCAGTGCAG ATCCGGCGGG GACGAGAAAC ACTGACCTTA 2177

TATGTGACCC CTGAGGTCAC AGAATGAATA GATCACCAAG AGTATGAGGC TCCTGCTCTG 2237

ATTTCCTCCT TGCCTTTCTG GCTGAGGTTC TGAGGGCACC GAGACACAGG GTTAAATGAA 2297

CCAGTGGGGG CAGGTCCCTC CAACCACCAC CACTGACTCC TGGGCTCTGA AGAATCACAG 2357

AAACACTTTT TATATAAAAT AAAATTATAC CTACCAACAT ATTATACTAA AAAATCACGT 2417

GGGAGGGCTG GATCTTTTCC CCCACCAAAA GGCTAGAGGT AAAGCTGTAT CCCCCTAAAC 2477

TTAGGGGAGA TACTGGAGCT GACCATCCTG ACCTCCTA?T AAAGAAAATG AGCTGCTGAA 2537

AAAAAAAAAA AAAA 2551

SEQ ID NO:27:

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Val Ser Glu Thr Ser Phe Leu
305 310 315 320

Pro Arg Ile Pro Ala Pro Gly Gln Cys Gly Lys Gly Arg Phe Pro Leu
325 330 335

Ile Gln Gly Cys Leu Val Lys Phe Leu Ser Ser Ser Leu Leu Ala Ile
340 345 350

Ser Gln Tyr Pro Thr Arg Ser Pro Gln His Leu Leu Val Leu Leu Phe
355 360 365



(117) Japanese Unexamined Patent Publication No. H10-117789



Gly Cys Pro His Pro Leu Leu Phe Val
370 375

SEQ ID NO:28:

CGTGGATCCC GAGAAAGAGG CCCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60

GCACCACGAG CAGTACGAAC CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120

GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180

CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTCCTTCACC 240

AGCGCCCGGC CGTCGCCGCC CCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300

GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360

GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420

GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480

CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540

TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600

TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg
1 5 10 15

GCA TGG CGG GCT TTG GGG GGC ATT CGC TGG GGG AGG AGA CCC CGT TTG 695
Ala Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu
20 25 30

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg
35 40 45

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val
50 55 60

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro
65 70 75

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala
80 85 90 95

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu
100 105 110

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly
115 120 125

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro
130 135 140

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro
145 150 155

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg
160 165 170 175

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly
180 185 190

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg
195 200 205

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val
210 215 220

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro
225 230 235

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu
240 245 250 255

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr
260 265 270

TCC GGC ATT GTT AGC TCT GCT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463
Ser Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu
275 280 285

CCC CAA ACC AAT G7G GAA TAC ATT CAA ACT GAT CCA GCT ATT GAT TTT 
Pro Gin Thr Asa Val Glu Tyr Xle Gin Thr Asp Ala Ala lie Asp Phe 
290 295 300 

GGA AAC TCT GGA GGT CCC CTG CTT' AAC CTG GCT AGG CAA CTG GGG GCT 
Cly Aan Ser Gly Qly Pro Leu Val Aan Leu Ala Arg Glu Leu Gly Ala 
305 310 31S 

GTA TCC CTG CAG GAT GGG GAG GTG ATT GGA GTG AAC ACC ATC AAG GTC 
Val S«r L*u Gin Asp Gly Glu Val lie Gly Val Asa Thr Met Lys Val 
320 325 330 335 

ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT GAT CGT CTT CCA GAG TTT 
Thr Ala Gly tie Sar Phe Ala He Pro Ser Asp Arg Leu Arg Glu Ffte 
340 345 350 

CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC TCC GGA ATC AGT GGG TCC 
Lau His Arg Cly Glu Lys Lys Aja Ser Ser Ser Gly Xle Ser Gly Ser 
355 360 3G5 

CAG CGG CGC TAC ATT GGG GTG ATG A7G CTG ACC CTG AGT CCC AGG GCT 
Gin Arg Arg Tyr He Gly Val Met Mat Leu Thr L«u Ser Pro Arg Ala 
370 375 3fl0 

GGT CTG CGG CCT GGT .GAT GTG ATT TTG GCC ATT GGG GAG CAG ATG GTA 
Gly Leu Arg Pro Cly Asp Val He Leu Ala He Gly Glu Gin Meb Val 
385 390 395 

CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT CGA ACC CAA TCC CAG TTG 
Gin Ash Ala Glu Asp Val Tyr Glu Ala VaX Arg Thr Gin Ser Gin Leu 
400 405 410 415 

GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA CTG ACC TTA TAT GTG ACC 
Ala Val Gin He Arg Arg Gly Arg Glu Thr Lqu Thr Leu Tyr Val Thr 
420 425 430 
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1511 



1359 



1607 



16S5 



1703 



1751 



1799 



1847 



1895 



CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA TGAGGCTCCT GCTCTGATTT CC 1952 
Pro Glu Val Thr Glu 
435

TCCTTGCCTT TCTGGCTGAG GTTCTGAGGG CACCGAGACA GAGGGTTAAA TGAACCAGTG 2012

GGGGCAGGTC CCTCCAACCA CCAGCACTGA CTCCTGGGCT CTGAAGAATC ACAGAAACAC 2072 

TTTTTATATA AAATAAAATT ATACCTAGCA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2132 

AAAAAAAAAA AA 2144 

SEQ ID NO: 29:

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Arg Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Ala Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Ala Arg Glu Leu Gly Ala Val
305 310 315 320

Ser Leu Gln Asp Gly Glu Val Ile Gly Val Asn Thr Met Lys Val Thr
325 330 335

Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp Arg Leu Arg Glu Phe Leu
340 345 350

His Arg Gly Glu Lys Lys Asn Ser Ser Ser Gly Ile Ser Gly Ser Gln
355 360 365

Arg Arg Tyr Ile Gly Val Met Met Leu Thr Leu Ser Pro Arg Ala Gly
370 375 380

Leu Arg Pro Gly Asp Val Ile Leu Ala Ile Gly Glu Gln Met Val Gln
385 390 395 400

Asn Ala Glu Asp Val Tyr Glu Ala Val Arg Thr Gln Ser Gln Leu Ala
405 410 415

Val Gln Ile Arg Arg Gly Arg Glu Thr Leu Thr Leu Tyr Val Thr Pro
420 425 430

Glu Val Thr Glu
435

SEQ ID NO: 30:

Feature: Polymorphic variants at 672 and 1435

CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC GCGTAGAGCA 60

GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG GGCGAGAGGC GAAGTGGTCA 120

GGCGCCGAAG GCCGAGAGCA CGCGGGGATC GGTCTCTTCC CGCCGGGTCT CTTACCGGTG 180

CGAGTCAAAG AGCCGCTCCG GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG 240

AGCGCCCGGC CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG 300

GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG CGCGCCGGAA 360

GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG TCTCACAACT CGCGTCCGGC 420

GGAGACCACA ATTCCCGGCA TTCGTGGGGC AGGGAGGAGT CGGCCTCCCG GAATCCTGGT 480

CCCGGCGTGC ACTTCTGAAG GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC 540

TGCTCGCGTC CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC 600

TG ATG GCT GCG CCG AGG GCG GGG CGG GGT GCA GGC TGG AGC CTT CGG 647
Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg
1 5 10 15

GCA TGG CGG GCT TTG GGG GGC ATT YGC TGG GGG AGG AGA CCC CGT TTG 695
Ala Trp Arg Ala Leu Gly Gly Ile Xaa Trp Gly Arg Arg Pro Arg Leu
20 25 30

ACC CCT GAC CTC CGG GCC CTG CTG ACG TCA GGA ACT TCT GAC CCC CGG 743
Thr Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg
35 40 45

GCC CGA GTG ACT TAT GGG ACC CCC AGT CTC TGG GCC CGG TTG TCT GTT 791
Ala Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val
50 55 60

GGG GTC ACT GAA CCC CGA GCA TGC CTG ACG TCT GGG ACC CCG GGT CCC 839
Gly Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro
65 70 75

CGG GCA CAA CTG ACT GCG GTG ACC CCA GAT ACC AGG ACC CGG GAG GCC 887
Arg Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala
80 85 90 95

TCA GAG AAC TCT GGA ACC CGT TCG CGC GCG TGG CTG GCG GTG GCG CTG 935
Ser Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu
100 105 110

GGC GCT GGG GGG GCA GTG CTG TTG TTG TTG TGG GGC GGG GGT CGG GGT 983
Gly Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly
115 120 125

CCT CCG GCC GTC CTC GCC GCC GTC CCT AGC CCG CCG CCC GCT TCT CCC 1031
Pro Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro
130 135 140

CGG AGT CAG TAC AAC TTC ATC GCA GAT GTG GTG GAG AAG ACA GCA CCT 1079
Arg Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro
145 150 155

GCC GTG GTC TAT ATC GAG ATC CTG GAC CGG CAC CCT TTC TTG GGC CGC 1127
Ala Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg
160 165 170 175

GAG GTC CCT ATC TCG AAC GGC TCA GGA TTC GTG GTG GCT GCC GAT GGG 1175
Glu Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly
180 185 190

CTC ATT GTC ACC AAC GCC CAT GTG GTG GCT GAT CGG CGC AGA GTC CGT 1223
Leu Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg
195 200 205

GTG AGA CTG CTA AGC GGC GAC ACG TAT GAG GCC GTG GTC ACA GCT GTG 1271
Val Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val
210 215 220

GAT CCC GTG GCA GAC ATC GCA ACG CTG AGG ATT CAG ACT AAG GAG CCT 1319
Asp Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro
225 230 235

CTC CCC ACG CTG CCT CTG GGA CGC TCA GCT GAT GTC CGG CAA GGG GAG 1367
Leu Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu
240 245 250 255

TTT GTT GTT GCC ATG GGA AGT CCC TTT GCA CTG CAG AAC ACG ATC ACA 1415
Phe Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr
260 265 270

TCC GGC ATT GTT AGC TCT GYT CAG CGT CCA GCC AGA GAC CTG GGA CTC 1463
Ser Gly Ile Val Ser Ser Xaa Gln Arg Pro Ala Arg Asp Leu Gly Leu
275 280 285

CCC CAA ACC AAT GTG GAA TAC ATT CAA ACT GAT GCA GCT ATT GAT TTT 1511
Pro Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe
290 295 300

GGA AAC TCT GGA GGT CCC CTG GTT AAC CTG GAT GGG GAG GTG ATT GGA 1559
Gly Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly
305 310 315

GTG AAC ACC ATG AAG GTC ACA GCT GGA ATC TCC TTT GCC ATC CCT TCT 1607
Val Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser
320 325 330 335

GAT CGT CTT CGA GAG TTT CTG CAT CGT GGG GAA AAG AAG AAT TCC TCC 1655
Asp Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser
340 345 350

TCC GGA ATC AGT GGG TCC CAG CGG CGC TAC ATT GGG GTG ATG ATG CTG 1703
Ser Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu
355 360 365

ACC CTG AGT CCC AGC ATC CTT GCT GAA CTA CAG CTT CGA GAA CCA AGC 1751
Thr Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser
370 375 380

TTT CCC GAT GTT CAG CAT GGT GTA CTC ATC CAT AAA GTC ATC CTG GGC 1799
Phe Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly
385 390 395

TCC CCT GCA CAC CGG GCT GGT CTG CGG CCT GGT GAT GTG ATT TTG GCC 1847
Ser Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala
400 405 410 415

ATT GGG GAG CAG ATG GTA CAA AAT GCT GAA GAT GTT TAT GAA GCT GTT 1895
Ile Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val
420 425 430

CGA ACC CAA TCC CAG TTG GCA GTG CAG ATC CGG CGG GGA CGA GAA ACA 1943
Arg Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr
435 440 445

CTG ACC TTA TAT GTG ACC CCT GAG GTC ACA GAA TGAATAGATC ACCAAGAGTA 1996
Leu Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

TGAGGCTCCT GCTCTGATTT CCTCCTTGCC TTTCTGGCTG AGGTTCTGAG GGCACCGAGA 2056
CAGAGGGTTA AATGAACCAG TGGGGGCAGG TCCCTCCAAC CACCAGCACT GACTCCTGGG 2116
CTCTGAAGAA TCACAGAAAC ACTTTTTATA TAAAATAAAA TTATACCTAG CAftCATAAAA 2176
AAAAAAAAAA A 2187

SEQ ID NO: 31:
Feature

24 Xaa = Arg or Cys
278 Xaa = Ala or Val

Met Ala Ala Pro Arg Ala Gly Arg Gly Ala Gly Trp Ser Leu Arg Ala
1 5 10 15

Trp Arg Ala Leu Gly Gly Ile Xaa Trp Gly Arg Arg Pro Arg Leu Thr
20 25 30

Pro Asp Leu Arg Ala Leu Leu Thr Ser Gly Thr Ser Asp Pro Arg Ala
35 40 45

Arg Val Thr Tyr Gly Thr Pro Ser Leu Trp Ala Arg Leu Ser Val Gly
50 55 60

Val Thr Glu Pro Arg Ala Cys Leu Thr Ser Gly Thr Pro Gly Pro Arg
65 70 75 80

Ala Gln Leu Thr Ala Val Thr Pro Asp Thr Arg Thr Arg Glu Ala Ser
85 90 95

Glu Asn Ser Gly Thr Arg Ser Arg Ala Trp Leu Ala Val Ala Leu Gly
100 105 110

Ala Gly Gly Ala Val Leu Leu Leu Leu Trp Gly Gly Gly Arg Gly Pro
115 120 125

Pro Ala Val Leu Ala Ala Val Pro Ser Pro Pro Pro Ala Ser Pro Arg
130 135 140

Ser Gln Tyr Asn Phe Ile Ala Asp Val Val Glu Lys Thr Ala Pro Ala
145 150 155 160

Val Val Tyr Ile Glu Ile Leu Asp Arg His Pro Phe Leu Gly Arg Glu
165 170 175

Val Pro Ile Ser Asn Gly Ser Gly Phe Val Val Ala Ala Asp Gly Leu
180 185 190

Ile Val Thr Asn Ala His Val Val Ala Asp Arg Arg Arg Val Arg Val
195 200 205

Arg Leu Leu Ser Gly Asp Thr Tyr Glu Ala Val Val Thr Ala Val Asp
210 215 220

Pro Val Ala Asp Ile Ala Thr Leu Arg Ile Gln Thr Lys Glu Pro Leu
225 230 235 240

Pro Thr Leu Pro Leu Gly Arg Ser Ala Asp Val Arg Gln Gly Glu Phe
245 250 255

Val Val Ala Met Gly Ser Pro Phe Ala Leu Gln Asn Thr Ile Thr Ser
260 265 270

Gly Ile Val Ser Ser Xaa Gln Arg Pro Ala Arg Asp Leu Gly Leu Pro
275 280 285

Gln Thr Asn Val Glu Tyr Ile Gln Thr Asp Ala Ala Ile Asp Phe Gly
290 295 300

Asn Ser Gly Gly Pro Leu Val Asn Leu Asp Gly Glu Val Ile Gly Val
305 310 315 320

Asn Thr Met Lys Val Thr Ala Gly Ile Ser Phe Ala Ile Pro Ser Asp
325 330 335

Arg Leu Arg Glu Phe Leu His Arg Gly Glu Lys Lys Asn Ser Ser Ser
340 345 350

Gly Ile Ser Gly Ser Gln Arg Arg Tyr Ile Gly Val Met Met Leu Thr
355 360 365

Leu Ser Pro Ser Ile Leu Ala Glu Leu Gln Leu Arg Glu Pro Ser Phe
370 375 380

Pro Asp Val Gln His Gly Val Leu Ile His Lys Val Ile Leu Gly Ser
385 390 395 400

Pro Ala His Arg Ala Gly Leu Arg Pro Gly Asp Val Ile Leu Ala Ile
405 410 415

Gly Glu Gln Met Val Gln Asn Ala Glu Asp Val Tyr Glu Ala Val Arg
420 425 430

Thr Gln Ser Gln Leu Ala Val Gln Ile Arg Arg Gly Arg Glu Thr Leu
435 440 445

Thr Leu Tyr Val Thr Pro Glu Val Thr Glu
450 455

SEQ ID NO: 32:

CATCCGGCAT TGTTAGCTCT GC 22

SEQ ID NO: 33:

CATCCGGCAT TGTTAGCTCT GT 22

SEQ ID NO: 34:

CAATAGCTGC ATCAGTTTGA ATG 23

SEQ ID NO: 35:

TGGCGGGCTT TGGGGGGCAT TC 22

SEQ ID NO: 36:

TGGCGGGCTT TGGGGGGCAT TT 22

SEQ ID NO: 37:

GACGTCAGCA GGGCCCGGAG GTC 23

SEQ ID NO: 38:

GATACCCCAG CAGAAGCTGG 20

SEQ ID NO: 39:

GATACCCCAG CAGAAGCTGT 20

SEQ ID NO: 40:

GCTGACATCA TTGGCGG&GA C 21

Claims 

1. An isolated polynucleotide encoding a biologically active PSP1 polypeptide. 

2. An isolated polynucleotide selected from the group consisting of: 

(a) a polynucleotide encoding PSP1-1 having the nucleotide sequence as set forth in SEQ
ID NO: 24 from nucleotide 603 to 1979;

(b) a polynucleotide encoding PSP1-2 having the nucleotide sequence as set forth in SEQ
ID NO: 23 from nucleotide 603 to 1979;

(c) a polynucleotide encoding PSP1-3 having the nucleotide sequence as set forth in SEQ 
ID NO: 26 from nucleotide 603 to 1736; 

(d) a polynucleotide encoding PSP1-4 having the nucleotide sequence as set forth in SEQ
ID NO: 28 from nucleotide 603 to 1913; and

(e) a polynucleotide encoding D87257 (1325T) protein. 
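
By way of illustration only, the coordinate ranges recited in claim 2 can be checked for reading-frame consistency with a short script. The sketch below is not part of the claimed subject matter; the names CODING_RANGES and orf_summary are hypothetical, and it assumes that each recited range is 1-based, inclusive, and ends with the stop codon.

```python
# Coordinate ranges as recited in claim 2 (assumed 1-based and inclusive,
# and assumed to include the terminating stop codon).
CODING_RANGES = {
    "PSP1-1": (603, 1979),
    "PSP1-2": (603, 1979),
    "PSP1-3": (603, 1736),
    "PSP1-4": (603, 1913),
}


def orf_summary(start, end):
    """Return length, frame check and residue count for one coding range."""
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return {
        "length_nt": length,
        "in_frame": remainder == 0,
        # One codon is the stop codon, so it does not encode a residue.
        "residues": codons - 1,
    }


if __name__ == "__main__":
    for name, (start, end) in CODING_RANGES.items():
        print(name, orf_summary(start, end))
```

Under these assumptions every range is a whole number of codons, and the residue counts agree with the lengths of the amino acid sequences in the listing (for example, 436 residues for PSP1-4 in SEQ ID NO: 29).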

3. An isolated polynucleotide substantially similar to SEQ ID NO: 24; SEQ ID NO: 23;
SEQ ID NO: 26; or SEQ ID NO: 28.

4. An isolated polynucleotide as claimed in claim 2 or 3 wherein nucleotides 672 and 1435
are independently selected from C and T.
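
For illustration, the effect of selecting C or T at nucleotides 672 and 1435 can be worked through from the sequence listing: the coding region starts at nucleotide 603, the codon containing nucleotide 672 reads CGC in the C form, and the codon containing nucleotide 1435 reads GCT in the C form. The following sketch is an assumption-laden aid, not the applicants' method; the helper names are hypothetical and only the four standard-genetic-code entries needed here are included.

```python
CDS_START = 603  # first base of the ATG codon in the listing

# Codons containing each polymorphic position, written in the C form
# as they appear in the sequence listing.
REFERENCE_CODONS = {672: "CGC", 1435: "GCT"}

# Subset of the standard genetic code needed for these two codons.
CODON_TABLE = {"CGC": "Arg", "TGC": "Cys", "GCT": "Ala", "GTT": "Val"}


def codon_context(position):
    """Return (residue number, 0-based index within the codon)."""
    offset = position - CDS_START
    return offset // 3 + 1, offset % 3


def residue_for(position, base):
    """Return (residue number, amino acid) when `base` is at `position`."""
    residue_number, index = codon_context(position)
    codon = REFERENCE_CODONS[position]
    codon = codon[:index] + base + codon[index + 1:]
    return residue_number, CODON_TABLE[codon]


if __name__ == "__main__":
    for position in (672, 1435):
        for base in "CT":
            print(position, base, residue_for(position, base))
```

The result agrees with the Feature entries of SEQ ID NO: 31, where residue 24 is Arg or Cys and residue 278 is Ala or Val.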

5. An isolated polynucleotide having the nucleotide sequence as set forth in SEQ ID NOs:
23, 24, 26, 28, or 30 or SEQ ID NO: 17 wherein nucleotide 1325 is T.

6. A functional polypeptide encoded by the polynucleotide of any one of claims 1 to 5.

7. The functional polypeptide of claim 6 which is: 

PSP1-1 having the amino acid sequence set forth in SEQ ID NO: 25 or 30;

PSP1-2 having the amino acid sequence set forth in SEQ ID NO: 8;

PSP1-3 having the amino acid sequence set forth in SEQ ID NO: 27;

PSP1-4 having the amino acid sequence set forth in SEQ ID NO: 29; or

D87257 (1325T) protein having the amino acid sequence set forth in SEQ ID NO: 18
wherein amino acid residue 213 is Val.

8. The polynucleotide as claimed in any one of claims 1 to 5 which is DNA or RNA.

9. A vector comprising the DNA of claim 8.

10. A recombinant host cell comprising the vector of claim 9.

11. A method for preparing essentially pure PSP1 protein or D87257 (1325T) protein
comprising culturing the recombinant host cell of claim 10 under conditions promoting 
expression of the protein and recovering the expressed protein. 

12. PSP1 or D87257 (1325T) produced by the process of claim 11. 

13. An antisense oligonucleotide comprising a sequence which is capable of binding to the
polynucleotide of any one of claims 1 to 5 or D87258.

14. A modulator of the polypeptide of claim 6 or of D87258.

15. A method for assaying a medium for the presence of a substance that modulates PSP1
or D87258 activity:

(i) by affecting the binding of PSP1 or D87258 to cellular binding partners comprising the
steps of: 

(a) providing a PSP1 polypeptide of claim 8 or D87258 protein, or a functional
derivative thereof and a cellular binding partner or synthetic analog thereof;

(b) incubating with a test substance which is suspected of modulating PSP1 or
D87258 activity under conditions which permit the formation of a PSP1 or D87258
protein/cellular binding partner complex;

(c) assaying for the presence of the complex, free PSP1 or D87258 protein or free 
cellular binding partner; and 

(d) comparing to a control to determine the effect of the substance; 

(ii) by inhibiting proteolytic activity on a cellular substrate comprising the steps of:

(a) providing a PSP1 polypeptide of claim 8 or D87258, or a functional derivative
thereof and a cellular substrate or synthetic analog thereof;

(b) incubating with a test substance which is suspected of inhibiting PSP1 or
D87258 activity under conditions which permit the formation of a PSP1 or D87258
enzyme/substrate complex and subsequent cleavage of the substrate;

(c) assaying for the presence of proteolytically cleaved substrate; and 

(d) comparing to a control to determine the effect of the substance. 

(iii) by direct binding to PSP1 or D87258 protein comprising the steps of:

(a) providing a labelled PSP1 polypeptide of claim 8 or D87258, or a functional
derivative thereof;

(b) providing solid support-associated modulator candidates; 

(c) incubating a mixture of the labelled PSP1 or D87258 protein with the support-
associated modulator candidates under conditions which can permit the formation of a
PSP1 or D87258 protein/modulator candidate complex;

(d) separating the solid support from free soluble-labelled PSP1 or D87258 protein;

(e) assaying for the presence of solid support-associated labelled protein;

(f) isolating the solid support complexed with labelled PSP1 or D87258 protein; and

(g) identifying the modulator candidate.

16. PSP1 or D87258 protein modulating compounds identified by the method of claim 15.

17. The use of a modulating compound of claim 16 in the manufacture of a medicament for
treating a patient having need to modulate PSP1 or D87258 activity.

18. A pharmaceutical composition comprising the modulating compound of claim 16 and a
pharmaceutically acceptable carrier.

19. A method of diagnosing conditions associated with PSP1 or D87258 protein deficiency
which comprises: 

(a) isolating a polynucleotide sample from an individual; 

(b) assaying the polynucleotide sample and a polynucleotide encoding PSP1
having the nucleotide sequence as set forth in SEQ ID NOs: 23, 24, 26, 28 or 30 or a
polynucleotide encoding D87258 as set forth in SEQ ID NO: 18; and

(c) comparing differences between the polynucleotide sample and the PSP1 or
D87258 polynucleotide, wherein any differences indicate mutations in the PSP1 or D87258
sequence.


20. A method of treating conditions which are related to insufficient PSP1 or D87258
protein function which comprises:
(i) the steps of: 

(a) isolating cells from a patient deficient in PSP1 or D87258 protein function;

(b) altering the cells by transfecting the polynucleotide of any one of claims 1 to 7,
or a polynucleotide encoding D87258 as set forth in SEQ ID NO: 18 into the cells wherein
a PSP1 or D87258 protein is expressed; and

(c) introducing the cells back to the patient to alleviate the condition; or

(ii) administering the polynucleotide of any one of claims 1 to 5, or a polynucleotide
encoding D87258 as set forth in SEQ ID NO: 18, to a patient deficient in PSP1 or D87258
protein function wherein a PSP1 or D87258 protein is expressed and alleviates the
condition.



21. An antibody immunoreactive with PSP1, D87258 or an immunogen thereof.

22. A transgenic non-human animal capable of expressing in any cell thereof the DNA of
claim 5 or a polynucleotide encoding D87258 as set forth in SEQ ID NO: 18.

23. A method for determining the genetic predisposition to neurodegeneration in a patient
comprising detecting PSP1 or D87258 polymorphisms in a sample from a patient,
preferably neurodegeneration predisposition to Alzheimer's disease.

24. The method of claim 23 wherein the polymorphisms detected are at nucleotide 672 of
PSP1, at nucleotide 1435 of PSP1 or at nucleotide 1325 of D87258.

25. The method of claim 24 wherein the polymorphisms are detected by polymerase chain
reaction, preferably wherein the oligonucleotides used with the polymerase chain reaction
have a nucleotide sequence selected from the group consisting of SEQ ID NOs: 32, 33, 34,
35, 36, 37, 38, 39, or 40.

26. An isolated polynucleotide having the nucleotide sequence as set forth in SEQ ID NO:
32, 33, 34, 35, 36, 37, 38, 39, or 40.

27. An oligonucleotide pair comprising oligonucleotides having the nucleotide sequence as
set forth in: 

(a) SEQ ID NOs: 32 and 34; 

(b) SEQ ID NOs: 33 and 34; 

(c) SEQ ID NOs: 35 and 37; 
(d) SEQ ID NOs: 36 and 37;

(e) SEQ ID NOs: 38 and 40; or

(f) SEQ ID NOs: 39 and 40.
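
As an illustrative aid only, the pairing logic of claim 27 can be sketched in silico for the pairs of items (c) and (d): SEQ ID NOs: 35 and 36 are identical except for the 3'-terminal base (C or T, corresponding to nucleotide 672), and SEQ ID NO: 37 is the reverse complement of a downstream stretch of the PSP1 sequence. The sketch below assumes a simplified exact-match model of primer annealing, which is not how discrimination works at the bench; the function names are hypothetical, and the template fragment is nucleotides 651 to 730 copied from the PSP1-1 line of the alignment figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Primers for nucleotide 672, taken from SEQ ID NOs: 35, 36 and 37.
FORWARD_C = "TGGCGGGCTTTGGGGGGCATTC"   # SEQ ID NO: 35
FORWARD_T = "TGGCGGGCTTTGGGGGGCATTT"   # SEQ ID NO: 36
REVERSE = "GACGTCAGCAGGGCCCGGAGGTC"    # SEQ ID NO: 37


def call_allele(template):
    """Return 'C' or 'T' if exactly one allele-specific pair fits, else None.

    Simplifying assumption: a primer "anneals" only on a perfect match.
    """
    site = template.find(reverse_complement(REVERSE))
    if site < 0:
        return None
    calls = [
        base
        for base, primer in (("C", FORWARD_C), ("T", FORWARD_T))
        if 0 <= template.find(primer) < site
    ]
    return calls[0] if len(calls) == 1 else None


# Nucleotides 651-730 of PSP1-1 (C at nucleotide 672), from the alignment.
TEMPLATE_C = (
    "TGGCGGGCTTTGGGGGGCATTCGCTGGGGGAGGAGACCCCGTTTGACCCC"
    "TGACCTCCGGGCCCTGCTGACGTCAGGAAC"
)
# Same fragment with T at nucleotide 672, as in the PSP1-2 line.
TEMPLATE_T = TEMPLATE_C[:21] + "T" + TEMPLATE_C[22:]

if __name__ == "__main__":
    print(call_allele(TEMPLATE_C), call_allele(TEMPLATE_T))
```

The same pattern would apply to the pairs of items (a), (b), (e) and (f) with the corresponding primers and template region.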

41 SGTS DPRRJIVTYGT PS LWARLSVGVTE PRACLTSGT PGP RAQLTAVT P DT 90 

• I.. : • I : . . : . : . . I I 1 I . . . . | . : . 

2 KKTTItAI*SRIjALSLGLALSPLSATAAETSSATTAQQMPSLAPMLEKV/MPS 51 

140 



SO 

190 

122. 
« 

191 . GLI VTNAHVVADRRRVRVRLLSG DT YEAWTAVD P VAD IATLRI QTKE P 239 

l.slll.lll.s :s|.| .| .::| : . : || ,||| :.||... 
123 KGWVTNNHVVDNATVIKVQLSOGRKFOAKMVGK0PRSDIALIQIQNPKN 172 

240 LPTLPLGRSADVRQGErWAMGSPFALQOTITSGIVSSAQRPARDLGLPO 289 

I • • 5 • s s I • • : I I : : • I ; : I - I I ; f . : I : I I I | | I . . I . | | . 
173 LT AIKMADS DALRVGD YT VGIGN P FGLGETV? S G I VS ALGRS GLNA 218 

290 TNVE . YIQTDAAIDFGNSGGPLVNLDGEVTGVNTMKVrA GISFAI 333 

• I I --MIMri: I I II I : I I I I : I I ; | I ; | | I I : I I I 

219 ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGWIGCGFAI 268 

334 PSDRLREF LHRGE 34 6 

!(:.:::: : . I M 

269 PSNMVKNLTSQMVEYGQVKE^EXGIMGTEI^NSELAKAMKVDAQRGAFVSQ 318 
• 

347 .KKNSSSGISG.. SQRRYIGVM. . . .MLTL. • . 369 

• I I I • : * I .1:1.1 .||| 

319 VLPNS S AAKAGIKAG D VI T SLNGKP I S S FAALRAQ VGTMPVGS KLVLGLL 348/ 

• • • ■ • 
370 SPSILAELQLREPSFPDVQHGVLIHKVTL 398 

KM:*::: I I .: ::!!:::. I 
369 RDGKQVNV^LELQQSSQNQVDSSSIFNGIEGAEMSNKGKDQGVVVNNVKT 418 

* • • • • 
399 GS PAHRAGLRPGOVI LAIGEQMVQNAEDVYEAVRTQ - SQLAVQIRRGRET 44 7 

(•II • II : • I I I I : : . : I I . I . : : . . : . . I | | : . | . | [ 
419 GTPAAQIGLKKGDVIIGANQQAVKNIAELRKVIiDSKPSVIALNIQRGDRH 4 68 

448 LTLYVTPEVTE 458 
I.: 

469 LPVNAVISLNP 479 * 



• • • 

91 RTREAS ENSGTRSElAWLAVALGAGGAVXiLLLWGGGRGPPAVLAAVPS P P P 
" • ■ I • I « I • : : : : • : . . ; : I : : . . : . . . | | . 
52 WSIMVEGSTTVOTPRMPRNPQQ FFGDD SPFCQEG5PFQ 

141 ASPRSQYNFIADVVEKTAPAWYIEILDRHPFLGREVPISNGSGFVVAAD 
* I I : I : : . : ' I I I . : : . I I 

91 SSPFCQ. . . GGQGGNGGGQQQKFMAL . . GSGVTIDAD 

1 50
PSP1-2 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC
PSP1-1 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC
PSP1-3 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC
PSP1-4 CGTGGATCCC GAGAAAGAGG CGCAGGACGA GGAGGCAGAA CCCGACTGGC

51 100
PSP1-2 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG
PSP1-1 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG
PSP1-3 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG
PSP1-4 GCGTAGAGCA GCAGCACGAG CAGTAGGAAG CAGTCACCCG GAAGCCTGGG

101 150
PSP1-2 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC
PSP1-1 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC
PSP1-3 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC
PSP1-4 GGCGAGAGGC GAAGTGGTCA GGCGCCGAAG GCCGAGAGCA CGCGGGGATC

151 200
PSP1-2 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG
PSP1-1 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG
PSP1-3 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG
PSP1-4 GGTCTCTTCC CGCCGGGTCT CTTACCGGTG CGAGTCAAAG AGCCGCTCCG

201 250
PSP1-2 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC
PSP1-1 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC
PSP1-3 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC
PSP1-4 GCCCCGGCCC TGAGGGAAGC TCCATAACTG CTGCTTCAGG AGCGCCCGGC

251 300
PSP1-2 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG
PSP1-1 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG
PSP1-3 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG
PSP1-4 CGTCGCCGCC GCCGCCATTT TCGCGCCCGG CCGCAGGGGC TCTTGGGAAG

301 350
PSP1-2 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG
PSP1-1 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG
PSP1-3 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG
PSP1-4 GCGGAGTCTT TGGGCATCCG CCCGGGGTGA GGGGACCCGA AGTCCTGAGG

351 400
PSP1-2 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG
PSP1-1 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG
PSP1-3 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG
PSP1-4 CGCGCCGGAA GGGCTAGCGG TCCCAGCATA CCCCGCGGCC CCTTGGGCCG

401 450
PSP1-2 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC
PSP1-1 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC
PSP1-3 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC
PSP1-4 TCTCACAACT CGCGTCCGGC GGAGACCACA ATTCCCGGCA TTCGTGGGGC

451 500
PSP1-2 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG
PSP1-1 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG
PSP1-3 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG
PSP1-4 AGGGAGGAGT CGGCCTCCCG GAATCCTGGT CCCGGCGTGC ACTTCTGAAG

501 550
PSP1-2 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC
PSP1-1 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC
PSP1-3 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC
PSP1-4 GACTTCAGGT ACCGGCGTGC CCCGCGTCCT ACTGTCCGCC TGCTCGCGTC

551 600
PSP1-2 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC
PSP1-1 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC
PSP1-3 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC
PSP1-4 CTGGGTGCCG CCTCTGAGTA GGGCGGGCGA GGAGGCAGCC AAGGCGGAGC

601 650
PSP1-2 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA
PSP1-1 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA
PSP1-3 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA
PSP1-4 TGATGGCTGC GCCGAGGGCG GGGCGGGGTG CAGGCTGGAG CCTTCGGGCA

651 700
PSP1-2 TGGCGGGCTT TGGGGGGCAT TTGCTGGGGG AGGAGACCCC GTTTGACCCC
PSP1-1 TGGCGGGCTT TGGGGGGCAT TCGCTGGGGG AGGAGACCCC GTTTGACCCC
PSP1-3 TGGCGGGCTT TGGGGGGCAT TCGCTGGGGG AGGAGACCCC GTTTGACCCC
PSP1-4 TGGCGGGCTT TGGGGGGCAT TCGCTGGGGG AGGAGACCCC GTTTGACCCC

701 750
PSP1-2 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG
PSP1-1 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG
PSP1-3 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG
PSP1-4 TGACCTCCGG GCCCTGCTGA CGTCAGGAAC TTCTGACCCC CGGGCCCGAG

751 800
PSP1-2 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT
PSP1-1 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT
PSP1-3 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT
PSP1-4 TGACTTATGG GACCCCCAGT CTCTGGGCCC GGTTGTCTGT TGGGGTCACT

801 850
PSP1-2 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT
PSP1-1 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT
PSP1-3 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT
PSP1-4 GAACCCCGAG CATGCCTGAC GTCTGGGACC CCGGGTCCCC GGGCACAACT

851 900
PSP1-2 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG
PSP1-1 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG
PSP1-3 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG
PSP1-4 GACTGCGGTG ACCCCAGATA CCAGGACCCG GGAGGCCTCA GAGAACTCTG

901 950
PSP1-2 GAACCCGTTC GCGCGCGTGG CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA
PSP1-1 GAACCCGTTC GCGCGCGTGG CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA
PSP1-3 GAACCCGTTC GCGCGCGTGG CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA
PSP1-4 GAACCCGTTC GCGCGCGTGG CTGGCGGTGG CGCTGGGCGC TGGGGGGGCA

951 1000
PSP1-2 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC
PSP1-1 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC
PSP1-3 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC
PSP1-4 GTGCTGTTGT TGTTGTGGGG CGGGGGTCGG GGTCCTCCGG CCGTCCTCGC

1001 1050
PSP1-2 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA
PSP1-1 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA
PSP1-3 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA
PSP1-4 CGCCGTCCCT AGCCCGCCGC CCGCTTCTCC CCGGAGTCAG TACAACTTCA

1051 1100
PSP1-2 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC
PSP1-1 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC
PSP1-3 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC
PSP1-4 TCGCAGATGT GGTGGAGAAG ACAGCACCTG CCGTGGTCTA TATCGAGATC

1101 1150
PSP1-2 CTGGACCGGC ACCCTTTCTT GGGCCGCGAG GTCCCTATCT CGAACGGCTC
PSP1-1 CTGGACCGGC ACCCTTTCTT GGGCCGCGAG GTCCCTATCT CGAACGGCTC
PSP1-3 CTGGACCGGC ACCCTTTCTT GGGCCGCGAG GTCCCTATCT CGAACGGCTC
PSP1-4 CTGGACCGGC ACCCTTTCTT GGGCCGCGAG GTCCCTATCT CGAACGGCTC

1151 1200
PSP1-2 AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC GCCCATGTGG
PSP1-1 AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC GCCCATGTGG
PSP1-3 AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC GCCCATGTGG
PSP1-4 AGGATTCGTG GTGGCTGCCG ATGGGCTCAT TGTCACCAAC GCCCATGTGG

1201 1250
PSP1-2 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT
PSP1-1 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT
PSP1-3 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT
PSP1-4 TGGCTGATCG GCGCAGAGTC CGTGTGAGAC TGCTAAGCGG CGACACGTAT

1251 1300
PSP1-2 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG
PSP1-1 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG
PSP1-3 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG
PSP1-4 GAGGCCGTGG TCACAGCTGT GGATCCCGTG GCAGACATCG CAACGCTGAG



1301 1350
PSP1-2 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG
PSP1-1 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG
PSP1-3 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG
PSP1-4 GATTCAGACT AAGGAGCCTC TCCCCACGCT GCCTCTGGGA CGCTCAGCTG

1351 1400
PSP1-2 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG
PSP1-1 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG
PSP1-3 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG
PSP1-4 ATGTCCGGCA AGGGGAGTTT GTTGTTGCCA TGGGAAGTCC CTTTGCACTG

1401 1450
PSP1-2 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG
PSP1-1 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG
PSP1-3 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG
PSP1-4 CAGAACACGA TCACATCCGG CATTGTTAGC TCTGCTCAGC GTCCAGCCAG

1451 1500
PSP1-2 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG
PSP1-1 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG
PSP1-3 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG
PSP1-4 AGACCTGGGA CTCCCCCAAA CCAATGTGGA ATACATTCAA ACTGATGCAG

1501 1550
PSP1-2 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT ..........
PSP1-1 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT ..........
PSP1-3 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT GGTGAGTGAG
PSP1-4 CTATTGATTT TGGAAACTCT GGAGGTCCCC TGGTTAACCT ..........

1551 1600
PSP1-2 .......... .......... .......... .......... ..........
PSP1-1 .......... .......... .......... .......... ..........
PSP1-3 ACATCCTTCC TTCCAAGAAT CCCTGCCCCA GGTCAGTGTG GGAAGGGTAG
PSP1-4 .......... .......... .......... .......... ..........

1601 1650
PSP1-2 .......... .......... .......... .......... ..........
PSP1-1 .......... .......... .......... .......... ..........
PSP1-3 CTTTCCCCTA ATTCAAGGAT GTTTGGTCAA GTTTCTGAGC AGTTCTTTGT
PSP1-4 .......... .......... .......... .......... ..........

1651 1700
PSP1-2 .......... .......... .......... .......... ..........
PSP1-1 .......... .......... .......... .......... ..........
PSP1-3 TGGCTATCTC TCAATATCCA ACCAGATCTC CCCAACACTT GCTGGTACTT
PSP1-4 .......... .......... .......... .......... ..........

1701 1750
PSP1-2 .......... .......... .......... .......... ..........
PSP1-1 .......... .......... .......... .......... ..........
PSP1-3 TTGTTCGGGT GCCCCCATCC CCTACTATTT GTTTAGGCTA GGGAACTGGG
PSP1-4 .......... .......... .......... .......... GGGAACTGGG

1751 1800 

PSPl-2 . . . . .GGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 

PSPl-1 GGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 

PSPl-3 GGCTGTATCC CTGCAGGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 
P3P1-4 GGCTGTATCC CTGCAGGATG GGGAGGTGAT TGGAGTGAAC ACCATGAAGG 
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1801 1850 
PSPl-2 TCACAGCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 
PSPl-l TCACAGCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 
PSP1-3 TCACAGCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 
PSP1-4 TCACAdCTGG AATCTCCTTT GCCATCCCTT CTGATCGTCT TCGAGAGTTT 



1851 

PS P 1 - 2 CTGCATCGTG 
PS P 1 - 1 CTGCATCGTG 
PS P 1 - 3 CTGCATCGTG 
PS P 1 - 4 CTGCATCGTG 



GCGAAAAGAA GAATTCCTCC 
GGGAAAAGAA GAATTCCTCC 
GCGAAAAGAA GAATTCCTCC 
GGGAAAAGAA GAATTCCTCC 



1900 

TCCGGAATCA GTGGGTCCCA 
TCCGGAATCA GTGGGTCCCA 
TCCGGAATCA GTGGGTCCCA 
TCCGGAATCA GTGGGTCCCA 



1901 1950 
PSPl-2 GCGGCGCTAC ATTGGGGTGA TGATGCTGAC CCTGAGTCCC AGCATCCTTG 
PSPl-l GCGGCGCTAC ATTGGGGTGA TGATGCTGAC CCTGAGTCCC. AGCATCCTTG 
PSP1-3 GCGGCGCTAC ATTGGGGTGA* TGATGCTGAC CCTGAGTCCC AGCATCCTTG 
FSP1-4 GCGGCGCTAC ATTGGGGTGA TGATGCTGAC CCTGAGTCCC A 

1951 2000 
PSP1-2 CTGAACTACA GCTTCGAGAA CCAAGCTTTC CCGATGTTCA GCATGGTGTA 
PSP1-1 CTGAACTACA GCTTCGAGAA CCAAGCTTTC CCGATGTTCA GCATGGTGTA
PSP1-3 CTGAACTACA GCTTCGAGAA CCAAGCTTTC CCGATGTTCA GCATGGTGTA 
PSP1-4 



2001 2050 
PSP1-2 CTCATCCATA AAGTCATCCT GGGCTCCCCT GCACACCGGG CTGGTCTGCG
PSP1-1 CTCATCCATA AAGTCATCCT GGGCTCCCCT GCACACCGGG CTGGTCTGCG
PSP1-3 CTCATCCATA AAGTCATCCT GGGCTCCCCT GCACACCGGG CTGGTCTGCG
PSP1-4                                         CCG CTGGTCTGCG



2051 2100
PSP1-2 GCCTCCTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG
PSP1-1 GCCTGGTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG
PSP1-3 GCCTGGTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG
PSP1-4 GCCTGGTGAT GTGATTTTGG CCATTGGGGA GCAGATGGTA CAAAATGCTG

2101 2150
PSP1-2 AAGATGTTTA TGAAGCTGTT CCAACCCAAT CCCAGTTGGC AGTGCAGATC
PSP1-1 AAGATGTTTA TGAAGCTGTT CGAACCCAAT CCCAGTTGGC AGTGCAGATC
PSP1-3 AAGATGTTTA TGAAGCTGTT CGAACCCAAT CCCAGTTGGC AGTGCAGATC
PSP1-4 AAGATGTTTA TGAAGCTGTT CGAACCCAAT CCCAGTTGGC AGTGCAGATC



2151 2200
PSP1-2 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA
PSP1-1 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA
PSP1-3 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA
PSP1-4 CGGCGGGGAC GAGAAACACT GACCTTATAT GTGACCCCTG AGGTCACAGA

2201 2250
PSP1-2 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC
PSP1-1 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC
PSP1-3 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC
PSP1-4 ATGAATAGAT CACCAAGAGT ATGAGGCTCC TGCTCTGATT TCCTCCTTGC

2251 2300
PSP1-2 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA
PSP1-1 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA
PSP1-3 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA
PSP1-4 CTTTCTGGCT GAGGTTCTGA GGGCACCGAG ACAGAGGGTT AAATGAACCA

2301 2350
PSP1-2 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA
PSP1-1 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA
PSP1-3 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA
PSP1-4 GTGGGGGCAG GTCCCTCCAA CCACCAGCAC TGACTCCTGG GCTCTGAAGA

2351 2400
PSP1-2 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAACATAAA
PSP1-1 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAACATAAA
PSP1-3 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAACATATT
PSP1-4 ATCACAGAAA CACTTTTTAT ATAAAATAAA ATTATACCTA GCAAAAAAAA

2401 2450
PSP1-2 AAAAAAAAAA AA.
PSP1-1 AAAAAAAAAA AA....
PSP1-3 ATAGTAAAAA ATGAGGTGGG AGGGCTGGAT CTTTTCCCCC ACCAAAAGGC
PSP1-4 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAA

2451 2500 

PSP1-2 

PSP1-1

PSP1-3 TAGAGGTAAA GCTGTATCCC CCTAAACTTA GGOGAGATAC TGGAGCTGAC
PSP1-4 

2501 2550 

PSP1-2 

PSP1-1 

PSP1-3 CATCCTGACC TCCTATTAAA GAAAATGAGC TGCTGAAAAA AAAAAAAAAA 
PSP1-4 



2551 



PSP1-2 
PSP1-1 
PSP1-3 
PSP1-4

1 MAAPRAGRGAGV7SLRAWRALGGIRWGRRPRLTPDUIALLTSGTS 44
16 LAAPASAQLSRAGRSAPLAAGCPDRCEFARCPPQPEHCE 54

45 DPRARVTYGTPSLWARLSVGVTEPRACLTSGTPGPRAQLTAVTP 88
55 GGRARDACGCCEVCGAPEGAACGLQEGPCGEGLQCWPFGVPASA 99

89 DTRTREASENSGTRSRAWLAVALGAGGAVLLLLWGGGRGPPAV 131
100 TVRRRAQAGLCVCASSEPVCGSDANTYANLCQLRAASRRSERLHRPPVIV 149

132 LAAVPSPP....PASPRSQYNFIADVVEKTAPAWYIEXLDRHPFLGREV 177
150 LQRGACGQGQEDPNSLRHKYNFIADWEKIAPAWHIELFRKLPFSKREV 199

178 PISKGSGFWAADGLIVVNAHWADRRRVRVRLLSGDTYEAWTAVDFVA 227
200 PVASGSGFIVSEDGLIVTNAHWTNKHRVKVELKNGATYEAKIKDVDEKA 249

228 DIATIiRIQTKEPLPTLPLGRSADVRQGEFWAMGSPFALQNTITSGIVSS 277
250 DIALIKIDHQGKLPVLLLGRSSELRPGEFWAIGSPFSLQNTVTTGIVST 299

278 AQRPARDLGLPQTNVEIfIQTDAAIDFGNSGGPLVNLIX5EVIGVNTMKVTA 327
300 TQRGGKELGLBMSDMDYIQTDAIINyGMSGGPLVNLDGEVXGIVTLKVTA 349

328 GrSFAIPSDRLREFLKRGEKKNSSSGISGSQRRYIGVMMLTLSPSILAEL 377
350 GISFAIPSDEOCKKFLTESKDRQ.AKGKA1TKKKYIGIRMMSLTSSKAKEL 398

378 QLREPSFPDVQHGVLIHKVILGSPAHRAGLRFGDVILAIGEQMVQNAEDV 427
399 KDRHRDFPDVISGAYITSVIPDTPAEAGGLKENDVIISINGQSWSANDV 448

428 YEAVRTQSQLAVQIRRGE^ETLTLYVTPEVTE 458
449 SDVIKRESTLNNWRRG^DIMITVIPEEIO 479

Abstract

Isolated nucleic acids encoding a human serine protease PSP1, proteins obtainable
from the nucleic acids, recombinant host cells transformed with the nucleic acids,
oligonucleotides and primer pairs specific for PSP1 polymorphisms, and uses of the
protein and nucleic acid sequences are disclosed.



